# Allowed search engines directives
User-agent: Mediapartners-Google
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Mobile
User-agent: Googlebot-News
User-agent: Googlebot-Video
User-agent: Adsbot-Google
User-agent: Twitterbot
User-agent: Applebot
User-agent: Bingbot
User-agent: SiteAuditBot
User-agent: Publication-Access-for-Facebook
User-agent: facebookexternalhit
User-agent: Flipboard
User-agent: FlipboardProxy
User-agent: upday
# Crawl-delay: 10
# Sitemaps
Sitemap: https://www.sudinfo.be/sitemap-correctif-si-0.xml
Sitemap: https://www.sudinfo.be/sitemap-correctif-si-1.xml
Sitemap: https://www.sudinfo.be/sites/default/files/sitemaps/sitemapnews-0.xml
Sitemap: https://www.sudinfo.be/sites/default/files/sitemaps/sitemapindex.xml
# Directories
Disallow: /includes/
Disallow: /misc/
Disallow: /modules/
Disallow: /profiles/
Disallow: /scripts/
Disallow: /themes/
# Files
Disallow: /CHANGELOG.txt
Disallow: /cron.php
Disallow: /INSTALL.mysql.txt
Disallow: /INSTALL.pgsql.txt
Disallow: /INSTALL.sqlite.txt
Disallow: /install.php
Disallow: /INSTALL.txt
Disallow: /LICENSE.txt
Disallow: /MAINTAINERS.txt
Disallow: /update.php
Disallow: /UPGRADE.txt
Disallow: /xmlrpc.php
# Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /filter/tips/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
Disallow: /user/logout/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=filter/tips/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
Disallow: /?q=user/logout/
Disallow: /*fb_comment_id=
# Disallow General Path
Disallow: /rossel.nuggad.net/
Disallow: /a.teads.tv/
Disallow: /javascripts/
Disallow: /cgi-mod/
Disallow: /shares_callback/
Disallow: /trackuity.immo.vlan.be/
Disallow: /81985301/
Disallow: /GPSLINK*
Disallow: /check_cookies*
Disallow: /phpincludes/
Disallow: /forward/
Disallow: /api/
Disallow: /reactions_callback/
Disallow: /programme-tv/*
Disallow: /*/paywall?page=*
Disallow: /term/*/paywall
Disallow: /node/*
Disallow: /taxonomy/term/*/article/
Disallow: /term/404/
Disallow: */art/*/article/*/acff/clubs/
Disallow: /taxonomy/term/*/acff/clubs/
# Archives Paths
#Disallow: /archive/index*
Disallow: /archives/recherche*
# App Universal Linking App IOS
#Allow: /.well-known/
#Allow: /apple-app-site-association
# CSS, JS, Images
Allow: /misc/*.css$
Allow: /misc/*.css?
Allow: /misc/*.js$
Allow: /misc/*.js?
Allow: /misc/*.gif
Allow: /misc/*.jpg
Allow: /misc/*.jpeg
Allow: /misc/*.png
Allow: /modules/*.css$
Allow: /modules/*.css?
Allow: /modules/*.js$
Allow: /modules/*.js?
Allow: /modules/*.gif
Allow: /modules/*.jpg
Allow: /modules/*.jpeg
Allow: /modules/*.png
Allow: /profiles/*.css$
Allow: /profiles/*.css?
Allow: /profiles/*.js$
Allow: /profiles/*.js?
Allow: /profiles/*.gif
Allow: /profiles/*.jpg
Allow: /profiles/*.jpeg
Allow: /profiles/*.png
Allow: /themes/*.css$
Allow: /themes/*.css?
Allow: /themes/*.js$
Allow: /themes/*.js?
Allow: /themes/*.gif
Allow: /themes/*.jpg
Allow: /themes/*.jpeg
Allow: /themes/*.png

# Not allowed bots
User-agent: 5emeRue
User-agent: 5erue
User-agent: adequat
User-agent: adequat-systems
User-agent: AmiSoftware
User-agent: AwarioRssBot
User-agent: AwarioSmartBot
User-agent: Argus
User-agent: Ask n read
User-agent: asknread.com
User-agent: Augure
User-agent: auramundi
User-agent: Bloodhound
User-agent: Cision
User-agent: coexel
User-agent: ConveraCrawler
User-agent: Corporama
User-agent: cydralspider
User-agent: Digimind
User-agent: Download Ninja
User-agent: downloadexpress
User-agent: EDD
User-agent: ellisphere
User-agent: eureka
User-agent: Europresse
User-agent: Explore
User-agent: Factiva
User-agent: Fasterfox
User-agent: Fetch
User-agent: gammaSpider
User-agent: grub-client
User-agent: HTTrack
User-agent: ia_archiver
User-agent: ia_archiver-web.archive.org
User-agent: indexer
User-agent: infoseek
User-agent: Jetbot
User-agent: k2spider
User-agent: Kantar
User-agent: kbcrawl
User-agent: Knowings
User-agent: larbin
User-agent: leadbox
User-agent: libwww
User-agent: linkfluence
User-agent: linko
User-agent: manageo
User-agent: mediacompil
User-agent: Meltwater
User-agent: mention
User-agent: Moreover
User-agent: MSIECrawler
User-agent: mytwip
User-agent: newscan-online
User-agent: NewsNow
User-agent: Newzbin
User-agent: NPBot
User-agent: ObjectsSearch
User-agent: Offline Explorer
User-agent: opinion-tracker
User-agent: Pimptrain
User-agent: proxem
User-agent: QuepasaCreep
User-agent: Qwam content intelligence
User-agent: Raven
User-agent: readability.com
User-agent: scoop.it
User-agent: score3
User-agent: Sindup
User-agent: sitecheck.internetseer.com
User-agent: SiteSnagger
User-agent: spotter
User-agent: Synthesio
User-agent: Talkwater
User-agent: Teleport
User-agent: TeleportPro
User-agent: trendeo
User-agent: trendybuzz
User-agent: TunitinBot
User-agent: TurnitinBot
User-agent: up2news
User-agent: vecteurplus
User-agent: Verif
User-agent: verticalsearch
User-agent: vsw
User-agent: wapspider
User-agent: WebCopier
User-agent: WebReaper
User-agent: WebStripper
User-agent: WebZinger
User-agent: WebZIP
User-agent: Wget
User-agent: winello
User-agent: Youmag
User-agent: Zealbot
User-agent: Zite
User-agent: ZyBORG
Disallow: /

# AI Data Scrapers
User-agent: AI2Bot
User-agent: Amazonbot
User-agent: Applebot-Extended
User-agent: anthropic-ai
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: cohere-ai
User-agent: Diffbot
User-agent: DuckAssistBot
User-agent: FacebookBot
User-agent: Google-Extended
User-agent: GPTBot
User-agent: Meta-ExternalAgent
User-agent: Meta-ExternalFetcher
User-agent: OAI-SearchBot
User-agent: PerplexityBot
Disallow: /