Script to Download Phishtank database for SA

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Script to Download Phishtank database for SA

Post by gotspatel » 2023-03-23 15:23

I recently came across an SA plugin: Mail::SpamAssassin::Plugin::Phishing

I have built (modified) a Python script to download the CSV file for phishing_phishtank_feed. It cannot otherwise be downloaded by a script, because the download requires a token and they have stopped giving access to their API.

SA Plugin options

Code: Select all

ifplugin Mail::SpamAssassin::Plugin::Phishing
        phishing_openphish_feed		/etc/spamassassin/feed.txt
        phishing_phishtank_feed		/etc/spamassassin/verified_online.csv
        phishing_phishstats_feed	/etc/spamassassin/phish_score.csv
        body     URI_PHISHING      	eval:check_phishing()
        describe URI_PHISHING      	Url match phishing in feed
        score URI_PHISHING			2.1
        phishing_phishstats_minscore	5
endif
The other databases can be downloaded as below:
phishing_openphish_feed "https://openphish.com/feed.txt" (free feed, updated every 6 hours)
phishing_phishstats_feed "https://phishstats.info/phish_score.csv" (updated every 90 minutes)

The problematic database
phishing_phishtank_feed "http://data.phishtank.com/data/online-valid.csv"

Python script (requires Python 2.7 or later, pip and selenium)

Code: Select all

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import os
import time

# Script downloaded from https://www.onlinetutorialspoint.com/selenium/python-selenium-download-a-file-in-headless-mode.html
# and modified as needed.

DOWNLOAD_URL = "https://data.phishtank.com/data/online-valid.csv"
download_dir = "C:\\Program Files\\JAM Software\\SpamAssassin for Windows\\etc\\spamassassin\\"
driver_path = "C:\\Scripts\\SpamAssassin\\Download_phish\\chromedriver.exe"

def enable_download(driver):
    # Allow file downloads in headless mode via the DevTools Page.setDownloadBehavior command.
    driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
    params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
    driver.execute("send_command", params)

def setting_chrome_options():
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_experimental_option("prefs", {
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing_for_trusted_sources_enabled": False,
        "safebrowsing.enabled": False
    })
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--disable-software-rasterizer')
    return chrome_options

def isFileDownloaded(timeout=120):
    # Wait until the CSV appears in download_dir; give up after `timeout` seconds.
    # os.path.join avoids the doubled backslash (download_dir already ends with one).
    file_path = os.path.join(download_dir, "verified_online.csv")
    deadline = time.time() + timeout
    while not os.path.isfile(file_path):
        if time.time() > deadline:
            print("Download timed out..")
            return False
        time.sleep(1)
    print("File Downloaded successfully..")
    return True

if __name__ == '__main__':
    # executable_path is the Selenium 3 style; Selenium 4 expects a Service object instead.
    driver = webdriver.Chrome(executable_path=driver_path, options=setting_chrome_options())
    try:
        enable_download(driver)
        driver.get(DOWNLOAD_URL)
        isFileDownloaded()
    finally:
        # Always close the headless browser, even if the download fails.
        driver.quit()

The file is updated every hour, but without an API login downloads are rate limited to once every 2 hours.
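
Given the 2-hour limit mentioned above, a scheduled job could skip the download while the local copy is still fresh. This is only a minimal sketch: the helper name `should_download` and its `min_age_seconds` parameter are made up for illustration and are not part of the script above, and the 2-hour default simply mirrors the limit stated in this post.

```python
import os
import time

def should_download(path, min_age_seconds=2 * 60 * 60, now=None):
    """Return True if `path` is missing, empty, or older than min_age_seconds."""
    if now is None:
        now = time.time()
    try:
        st = os.stat(path)
    except OSError:
        return True  # no local copy yet
    if st.st_size == 0:
        return True  # a 0-byte file is useless, so fetch again
    return (now - st.st_mtime) >= min_age_seconds
```

Calling `should_download(os.path.join(download_dir, "verified_online.csv"))` before starting the browser would avoid hitting the rate limit when the task runs more often than every 2 hours.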

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-03-23 18:28

What is this and why is it needed?

Code: Select all

driver_path = "C:\\Scripts\\SpamAssassin\\Download_phish\\chromedriver.exe"

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Re: Script to Download Phishtank database for SA

Post by gotspatel » 2023-03-24 05:53

palinka wrote:
2023-03-23 18:28
What is this and why is it needed?

Code: Select all

driver_path = "C:\\Scripts\\SpamAssassin\\Download_phish\\chromedriver.exe"
This is the driver that lets Chrome (the browser) run with --headless (without any window). It can be downloaded from HERE

It is needed so that the link opens in a browser, which in turn generates a unique token for downloading this CSV file; other scripting methods will not download it.

Regards,

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Re: Script to Download Phishtank database for SA

Post by gotspatel » 2023-03-24 06:33

Just to update

Code: Select all

        phishing_phishstats_feed	/etc/spamassassin/phish_score.csv

Code: Select all

phishing_phishstats_feed "https://phishstats.info/phish_score.csv" --Updated every 90 minutes
Works only on SpamAssassin version 4.0.0

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-03-24 09:07

@gotspatel

I must be one of the lucky ones: I have a PhishTank login and thus can download using my API key.
Your script seems a bit overkill... can't this be done just as easily with wget or curl?
CIDR to RegEx: d-fault.nl/cidrtoregex
DNS Lookup: d-fault.nl/dnstools
DKIM Generator: d-fault.nl/dkimgenerator
DNSBL Lookup: d-fault.nl/dnsbllookup
GEOIP Lookup: d-fault.nl/geoiplookup

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Re: Script to Download Phishtank database for SA

Post by gotspatel » 2023-03-24 09:28

RvdH wrote:
2023-03-24 09:07
@gotspatel

I must be one of the lucky ones, i have a PhishTank login and thus can download using my api key
Your script seems a bit overkill...can't this be done with wget or curl just as easy?
You sure are the lucky one and we all are also lucky to have you here :mrgreen:

I tried PowerShell, batch script, VBScript, wget and curl (on both Windows and Linux), but none of them worked. Every time you open the link it generates a new token, which is stored in the cache; the file is downloaded only if that token is read back and found.


h t t p s://cdn.phishtank.com/datadumps/verified_online.csv? Expires=1679643033&Signature=GyZpNuvHA7Aw9vIVl9Y2RTbEt998FXNc98vkjGR8zhlZoEaz2U~1xKcwgSjxO7KBy7PYRWJvzyI5v2eDIy1KA22tUXpkjutbg2jU437remEF4GTGKuC3iMXoKQ48BwDGgF2NEDoktu1xPVKu55wnqyiu9TtsJWY8HzDALyXOJeCPGLLEqBguqlTNFk-9OLeV6JLbPwvShcWDR1L3x~3OJUoLIPhJoljzXGQxxATQcWp1aeqSpGP5IiXJ8iKb0QQmgXVjyxP12O2FMuttDlhCbmrzlK2JphNb~DCOfuTvMcKXiH5IQuEbJcOrRvophixBJjEYBDGSkAN7goc0bGcuow__&Key-Pair-Id=APKAILB45UG3RB4CSOJA


The important parts highlighted
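
To make the token idea concrete, the query string of such a signed URL can be taken apart with Python's standard library. This is a sketch under an assumption: that `Expires` is a Unix timestamp in seconds, which is typical for signed CDN URLs but is not confirmed anywhere in this thread. The helper name `signed_url_expiry` is invented, and the signature in the example is shortened to a placeholder.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs

def signed_url_expiry(url):
    """Return the Expires query parameter of `url` as an aware UTC datetime."""
    query = parse_qs(urlsplit(url).query)
    # Assumption: Expires holds seconds since the Unix epoch.
    return datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)

# Signature replaced by a placeholder; Expires is the value from the URL above.
example = ("https://cdn.phishtank.com/datadumps/verified_online.csv"
           "?Expires=1679643033&Signature=PLACEHOLDER"
           "&Key-Pair-Id=APKAILB45UG3RB4CSOJA")
print(signed_url_expiry(example).isoformat())
```

If the assumption holds, each signed link is only valid for a limited time, which would explain why a saved copy of the final URL stops working.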

Maybe the code can be optimized; I will try it out and post the results. Logging is still a ::TODO:: that I am working on.

Is there any way to get an API login? I have sent multiple requests by email but have had no reply yet.

Regards

SorenR
Senior user
Senior user
Posts: 6123
Joined: 2006-08-21 15:38
Location: Denmark

Re: Script to Download Phishtank database for SA

Post by SorenR » 2023-03-24 14:01

Just brainstorming here...

If I put this http://data.phishtank.com/data/online-valid.csv in the address field of my browser it actually downloads the file...

If I use Wget I get a crap message that it cannot save some weirdly named file...

Curl ... well, I got rate limited :roll:

There is also a json dataset that I can get with

Code: Select all

wget --no-check-certificate http://data.phishtank.com/data/online-valid.json -O file.json
32 bit windows

Code: Select all

Function jsonDecode(jsonString)
    On Error Resume Next
    Dim oSCtrl
    Set oSCtrl = CreateObject("MSScriptControl.ScriptControl")
    oSCtrl.Language = "JScript"
    Set jsonDecode = oSCtrl.Eval("(" & jsonString & ")")
    On Error GoTo 0
End Function
64 bit windows + https://github.com/tablacus/TablacusScriptControl

Code: Select all

Function jsonDecode(jsonString)
    On Error Resume Next
    Dim oSCtrl
    Set oSCtrl = CreateObject("ScriptControl")
    oSCtrl.Language = "JScript"
    Set jsonDecode = oSCtrl.Eval("(" & jsonString & ")")
    On Error GoTo 0
End Function
SørenR.

To understand recursion, you must first understand recursion.

SorenR
Senior user
Senior user
Posts: 6123
Joined: 2006-08-21 15:38
Location: Denmark

Re: Script to Download Phishtank database for SA

Post by SorenR » 2023-03-24 14:20

Oh... https://phishtank.org/developer_info.php
We require that you use a descriptive User Agent string in your application to identify the application. If your User Agent is blank or generic, you may recieve an increased number of rate limited requests or be redirected to additional security checks.
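
The User-Agent requirement quoted above can be met from Python as well. The following is a rough sketch using only the standard library; the function names and the User-Agent value `phishtank/downloader` are example choices, not something PhishTank prescribes.

```python
import shutil
import urllib.request

def build_request(url, user_agent):
    """Build a GET request that identifies itself with `user_agent`."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def download(url, dest, user_agent="phishtank/downloader", timeout=60):
    """Fetch `url` (urlopen follows redirects) and write the body to `dest`."""
    request = build_request(url, user_agent)
    # urlopen runs first, so `dest` is not created if the request itself fails.
    with urllib.request.urlopen(request, timeout=timeout) as resp, \
            open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Because the destination name is passed explicitly, the long redirected URL never ends up in the local file name.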

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-03-24 14:23

wget has a parameter for that: -U

SA cannot work with the JSON dataset
Last edited by RvdH on 2023-03-24 14:27, edited 1 time in total.

SorenR
Senior user
Senior user
Posts: 6123
Joined: 2006-08-21 15:38
Location: Denmark

Re: Script to Download Phishtank database for SA

Post by SorenR » 2023-03-24 14:25

WTF ... !!!

Code: Select all

C:\WINDOWS\system32>wget -U "phishtank/admin" --no-check-certificate https://data.phishtank.com/data/online-valid.csv
--2023-03-24 13:23:00--  https://data.phishtank.com/data/online-valid.csv
Resolving data.phishtank.com... 104.17.177.85, 104.16.101.75
Connecting to data.phishtank.com|104.17.177.85|:443... connected.
WARNING: cannot verify data.phishtank.com's certificate, issued by `/OU=generated by Avast Antivirus for SSL/TLS scanning/O=Avast Web/Mail Shield/CN=Avast Web/Mail Shield Root':
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn.phishtank.com/datadumps/verified_online.csv?Expires=1679660589&Signature=SeEPYsQtJKfCj1pQOOw2dARbM8Fztc7NSw4J5Dbr8PHws0kFX5vt8uILGUY4~nbZiRqx9xdaUtlg6przRK5ks9R3wU8VVEKs6pniOoQ4P87MgWe-oLe8Uj6ihqZ-IluILRcGrzXeGEYg-xIVEuJD5P5gOKJMX5ri7LVT3a7J1xjTLNqwoqksAmmIICj8ZYeP3CfphyhfefHP~pwZanji1Z-gsC4aEVZo2NT085yShj2dNyV9myAD56SnjXpruTtCYTbe3y8O~af3UOfIkWuv3rsCPUdw-6E1GtgdVzlunv6RPiU6mv1jr0Q-WP-NbzCNEbTQgkBTEZZ~qJk0PxxJag__&Key-Pair-Id=APKAILB45UG3RB4CSOJA [following]
--2023-03-24 13:23:00--  https://cdn.phishtank.com/datadumps/verified_online.csv?Expires=1679660589&Signature=SeEPYsQtJKfCj1pQOOw2dARbM8Fztc7NSw4J5Dbr8PHws0kFX5vt8uILGUY4~nbZiRqx9xdaUtlg6przRK5ks9R3wU8VVEKs6pniOoQ4P87MgWe-oLe8Uj6ihqZ-IluILRcGrzXeGEYg-xIVEuJD5P5gOKJMX5ri7LVT3a7J1xjTLNqwoqksAmmIICj8ZYeP3CfphyhfefHP~pwZanji1Z-gsC4aEVZo2NT085yShj2dNyV9myAD56SnjXpruTtCYTbe3y8O~af3UOfIkWuv3rsCPUdw-6E1GtgdVzlunv6RPiU6mv1jr0Q-WP-NbzCNEbTQgkBTEZZ~qJk0PxxJag__&Key-Pair-Id=APKAILB45UG3RB4CSOJA
Resolving cdn.phishtank.com... 104.17.177.85, 104.16.101.75
Connecting to cdn.phishtank.com|104.17.177.85|:443... connected.
WARNING: cannot verify cdn.phishtank.com's certificate, issued by `/OU=generated by Avast Antivirus for SSL/TLS scanning/O=Avast Web/Mail Shield/CN=Avast Web/Mail Shield Root':
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/csv]
verified_online.csv@Expires=1679660589&Signature=SeEPYsQtJKfCj1pQOOw2dARbM8Fztc7NSw4J5Dbr8PHws0kFX5vt8uILGUY4~nbZiRqx9xdaUtlg6przRK5ks9R3wU8VVEKs6pniOoQ4P87MgWe-oLe8Uj6ihqZ-IluILRcGrzXeGEYg-xIVEuJD5P5gOKJMX5ri7LVT3a7J1xjTLNqwoqksAmmIICj8ZYeP3CfphyhfefHP~pwZanji1Z-gsC4aEVZo2NT085yShj2dNyV9myAD56SnjXpruTtCYTbe3y8O~af3UOfIkWuv3rsCPUdw-6E1GtgdVzlunv6RPiU6mv1jr0Q-WP-NbzCNEbTQgkBTEZZ~qJk0PxxJag__&Key-Pair-Id=APKAILB45UG3RB4CSOJA: No such file or directory

Cannot write to `verified_online.csv@Expires=1679660589&Signature=SeEPYsQtJKfCj1pQOOw2dARbM8Fztc7NSw4J5Dbr8PHws0kFX5vt8uILGUY4~nbZiRqx9xdaUtlg6przRK5ks9R3wU8VVEKs6pniOoQ4P87MgWe-oLe8Uj6ihqZ-IluILRcGrzXeGEYg-xIVEuJD5P5gOKJMX5ri7LVT3a7J1xjTLNqwoqksAmmIICj8ZYeP3CfphyhfefHP~pwZanji1Z-gsC4aEVZo2NT085yShj2dNyV9myAD56SnjXpruTtCYTbe3y8O~af3UOfIkWuv3rsCPUdw-6E1GtgdVzlunv6RPiU6mv1jr0Q-WP-NbzCNEbTQgkBTEZZ~qJk0PxxJag__&Key-Pair-Id=APKAILB45UG3RB4CSOJA' (Result too large).
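
The failure above appears to come from the local file name being derived from the redirected URL, query string included. As an illustration only, a name can be taken from the path part of the URL alone; `filename_from_url` and its `default` parameter are hypothetical names.

```python
import posixpath
from urllib.parse import urlsplit

def filename_from_url(url, default="download.bin"):
    """Return the last path segment of `url`, ignoring the query string."""
    name = posixpath.basename(urlsplit(url).path)
    return name or default
```

For the redirected URL in the log this yields `verified_online.csv`, which is short enough to write on any file system.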

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-03-24 14:28

Code: Select all

@echo off
cls

set "phishing_phishtank_feed=https://data.phishtank.com/data/online-valid.csv"
set "phishtank_filename=phishtank-feed.csv"

wget -b -q -N -U "phishtank/downloader" -O "%phishtank_filename%" "%phishing_phishtank_feed%"
This works just fine

Where:

Code: Select all

-O,  --output-document=FILE      write documents to FILE
-N,  --timestamping              don't re-retrieve files unless newer than local
-U,  --user-agent=AGENT          identify as AGENT instead of Wget/VERSION
-b,  --background                go to background after startup
-q,  --quiet                     quiet (no output)
Lowercase and uppercase parameters can have different meanings, i.e. they are case sensitive :!:

SorenR
Senior user
Senior user
Posts: 6123
Joined: 2006-08-21 15:38
Location: Denmark

Re: Script to Download Phishtank database for SA

Post by SorenR » 2023-03-24 14:43

RvdH wrote:
2023-03-24 14:28

Code: Select all

@echo off
cls

set "phishing_phishtank_feed=https://data.phishtank.com/data/online-valid.csv"
set "phishtank_filename=phishtank-feed.csv"

wget -b -q -N -U "phishtank/downloader" -O "%phishtank_filename%" "%phishing_phishtank_feed%"
This works just fine

Whereas:

Code: Select all

-O,  --output-document=FILE      write documents to FILE
-N,  --timestamping              don't re-retrieve files unless newer than local
-U,  --user-agent=AGENT          identify as AGENT instead of Wget/VERSION
-b,  --background                go to background after startup
-q,  --quiet                     quiet (no output)
lowercase and uppercase params sometimes have different meanings, eg: case sensitive :!:
I think I'm rate limited 'cause I'm not getting anything... Will try again in an hour or so.

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-03-24 14:47

With the API, i.e. when registered:

-U "phishtank/[username]"
https://data.phishtank.com/data/[api-key]/online-valid.csv

Code: Select all

@echo off
cls

set "phishtank_filename=phishtank-feed.csv"
set "phishtank_username={MYUSERNAME}"
set "phishtank_apikey={MYAPIKEY}"
set "phishtank_feed=https://data.phishtank.com/data/%phishtank_apikey%/online-valid.csv"

wget -b -q -N -U "phishtank/%phishtank_username%" -O "%phishtank_filename%" "%phishtank_feed%"

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-03-24 15:03

All in One, place wget.exe into C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\

C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\Phishing.cmd (to run with Task Scheduler)

Code: Select all

@echo off
cls

set "phishtank_filename=phishtank-feed.csv"
set "phishtank_feed=https://data.phishtank.com/data/online-valid.csv"
wget -b -q -N -U "phishtank/downloader" -O "%phishtank_filename%" "%phishtank_feed%"

set "openphish_filename=openphish-feed.txt"
set "openphish_feed=https://openphish.com/feed.txt"

wget -b -q -N -U "openphish/downloader" -O "%openphish_filename%" "%openphish_feed%"

set "phishstats_filename=phishstats-feed.csv"
set "phishstats_feed=https://phishstats.info/phish_score.csv"

wget -b -q -N -U "phishstats/downloader" -O "%phishstats_filename%" "%phishstats_feed%"

C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\Phishing.cf

Code: Select all

if (version >= 3.004002)

    ifplugin Mail::SpamAssassin::Plugin::Phishing

        phishing_openphish_feed C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\openphish-feed.txt
        phishing_phishtank_feed C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\phishtank-feed.csv
        phishing_phishstats_feed C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\phishstats-feed.csv

        #phishing_phishstats_minscore 5	
        #phishing_uri_noparam 1

        body		URI_PHISHING	eval:check_phishing()
        describe	URI_PHISHING	Url matches phishing feed
        score		URI_PHISHING	5.0

    endif

endif

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-03-24 15:44

gotspatel wrote:
2023-03-23 15:23

phishing_openphish_feed /etc/spamassassin/feed.txt
phishing_phishtank_feed /etc/spamassassin/verified_online.csv
phishing_phishstats_feed /etc/spamassassin/phish_score.csv
Do relative paths work? The docs say to use an absolute path (4.0.x), see:
https://spamassassin.apache.org/full/4. ... shing.html

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-03-25 13:08

Hmm, damn... it turns out the -N and -O {newfilename} parameters do not work together

Code: Select all

WARNING: timestamping does nothing in combination with -O. See the manual
for details.
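
Since -N does nothing once -O is given, a timestamp check would have to be done some other way. One possibility, sketched here in Python, is to send an If-Modified-Since header built from the local file's modification time. Whether the feed servers honor that header has not been verified in this thread, so treat it as an assumption; `conditional_headers` is a hypothetical helper name.

```python
import os
from email.utils import formatdate

def conditional_headers(path, user_agent="phishtank/downloader"):
    """Build request headers; add If-Modified-Since when a local copy exists."""
    headers = {"User-Agent": user_agent}
    if os.path.isfile(path):
        mtime = os.path.getmtime(path)
        # HTTP dates are always expressed in GMT.
        headers["If-Modified-Since"] = formatdate(mtime, usegmt=True)
    return headers
```

A server that honors the header would answer 304 Not Modified instead of sending the file again; with `urllib.request` that response surfaces as an `HTTPError` with code 304, which the caller can treat as "keep the existing file".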

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Re: Script to Download Phishtank database for SA

Post by gotspatel » 2023-04-08 08:37

RvdH wrote:
2023-03-24 15:03
All in One, place wget.exe into C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\

C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\Phishing.cmd (to run with Task Scheduler)

Code: Select all

@echo off
cls

set "phishtank_filename=phishtank-feed.csv"
set "phishtank_feed=https://data.phishtank.com/data/online-valid.csv"
wget -b -q -N -U "phishtank/downloader" -O "%phishtank_filename%" "%phishtank_feed%"

set "openphish_filename=openphish-feed.txt"
set "openphish_feed=https://openphish.com/feed.txt"

wget -b -q -N -U "openphish/downloader" -O "%openphish_filename%" "%openphish_feed%"

set "phishstats_filename=phishstats-feed.csv"
set "phishstats_feed=https://phishstats.info/phish_score.csv"

wget -b -q -N -U "phishstats/downloader" -O "%phishstats_filename%" "%phishstats_feed%"

C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\Phishing.cf

Code: Select all

if (version >= 3.004002)

    ifplugin Mail::SpamAssassin::Plugin::Phishing

        phishing_openphish_feed C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\openphish-feed.txt
        phishing_phishtank_feed C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\phishtank-feed.csv
        phishing_phishstats_feed C:\Program Files (x86)\SpamAssassin for Windows\etc\spamassassin\phishstats-feed.csv

        #phishing_phishstats_minscore 5	
        #phishing_uri_noparam 1

        body		URI_PHISHING	eval:check_phishing()
        describe	URI_PHISHING	Url matches phishing feed
        score		URI_PHISHING	5.0

    endif

endif
Just an update to this script: it needs to wait for the downloads to complete when running from Task Scheduler.

Code: Select all

@echo off
cls
REM wget.exe needs to be in the script folder
set "phishtank_filename=C:\Program Files\JAM Software\SpamAssassin for Windows\etc\spamassassin\phishtank-feed.csv"
set "phishtank_feed=https://data.phishtank.com/data/online-valid.csv"
wget -b -q -U "phishtank/downloader" -O "%phishtank_filename%" "%phishtank_feed%"

set "openphish_filename=C:\Program Files\JAM Software\SpamAssassin for Windows\etc\spamassassin\openphish-feed.txt"
set "openphish_feed=https://openphish.com/feed.txt"

wget -b -q -U "openphish/downloader" -O "%openphish_filename%" "%openphish_feed%"

set "phishstats_filename=C:\Program Files\JAM Software\SpamAssassin for Windows\etc\spamassassin\phishstats-feed.csv"
set "phishstats_feed=https://phishstats.info/phish_score.csv"

wget -b -q -U "phishstats/downloader" -O "%phishstats_filename%" "%phishstats_feed%"

ping -n 10 127.0.0.1 >nul


palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-04-08 11:39

gotspatel wrote:
2023-04-08 08:37

Code: Select all

REM wget.exe needs to be in the script folder
You can put it anywhere and place it in the system path. That might be more useful if you want to use it for more than this particular script.

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-04-08 12:13

You could also make these one-liners in your scheduled task actions and forget about the script altogether.

Code: Select all

wget -b -q -U "phishtank/downloader" -O "C:\SpamAssassin\etc\spamassassin\phishtank-feed.csv" "https://data.phishtank.com/data/online-valid.csv"
wget -b -q -U "openphish/downloader" -O "C:\SpamAssassin\etc\spamassassin\openphish-feed.txt" "https://openphish.com/feed.txt"
wget -b -q -U "phishstats/downloader" -O "C:\SpamAssassin\etc\spamassassin\phishstats-feed.csv" "https://phishstats.info/phish_score.csv"

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Re: Script to Download Phishtank database for SA

Post by gotspatel » 2023-04-08 12:24

palinka wrote:
2023-04-08 11:39
gotspatel wrote:
2023-04-08 08:37

Code: Select all

REM wget.exe needs to be in the script folder
You can put it anywhere and place it in the system path. That might be more useful if you want to use it for more than this particular script.
Correct :D

Done. In fact, I just installed UNIXUTILS and added it to the system path for more Unix commands.
Last edited by gotspatel on 2023-04-08 12:28, edited 1 time in total.

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Re: Script to Download Phishtank database for SA

Post by gotspatel » 2023-04-08 12:25

palinka wrote:
2023-04-08 12:13
You could also make these one liners in your schedule task actions and forget about the script altogether.

Code: Select all

wget -b -q -U "phishtank/downloader" -O "C:\SpamAssassin\etc\spamassassin\phishtank-feed.csv" "https://data.phishtank.com/data/online-valid.csv"
wget -b -q -U "openphish/downloader" -O "C:\SpamAssassin\etc\spamassassin\openphish-feed.txt" "https://openphish.com/feed.txt"
wget -b -q -U "phishstats/downloader" -O "C:\SpamAssassin\etc\spamassassin\phishstats-feed.csv" "https://phishstats.info/phish_score.csv"
I don't know why, but the PhishTank feed sometimes needs a wait, or else the file is written with 0 bytes :shock:

Hence "ping -n 10 127.0.0.1 >nul"

Regards

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-04-08 12:31

gotspatel wrote:
2023-04-08 12:25
palinka wrote:
2023-04-08 12:13
You could also make these one liners in your schedule task actions and forget about the script altogether.

Code: Select all

wget -b -q -U "phishtank/downloader" -O "C:\SpamAssassin\etc\spamassassin\phishtank-feed.csv" "https://data.phishtank.com/data/online-valid.csv"
wget -b -q -U "openphish/downloader" -O "C:\SpamAssassin\etc\spamassassin\openphish-feed.txt" "https://openphish.com/feed.txt"
wget -b -q -U "phishstats/downloader" -O "C:\SpamAssassin\etc\spamassassin\phishstats-feed.csv" "https://phishstats.info/phish_score.csv"
I don't know why but the Phishtank feed sometimes need a wait elase the file is written with 0 bytes :shock:

hence "ping -n 10 127.0.0.1 >nul"

Regards
Get rid of "-b" and you will be able to pull an exit code. As soon as it goes to background - which is ALWAYS successful - the exit code will always be 0 for success. Removing the background command will force it to complete with an exit code which you can use to find the error.

Allowing it to complete before exiting the task might just solve your issue in the first place.
EXIT STATUS
Wget may return one of several error codes if it encounters problems.

0 No problems occurred.
1 Generic error code.
2 Parse error---for instance, when parsing command-line options, the .wgetrc or .netrc...
3 File I/O error.
4 Network failure.
5 SSL verification failure.
6 Username/password authentication failure.
7 Protocol errors.
8 Server issued an error response.

With the exceptions of 0 and 1, the lower-numbered exit codes take precedence over higher-numbered ones, when multiple types of errors are encountered.

In versions of Wget prior to 1.12, Wget’s exit status tended to be unhelpful and inconsistent. Recursive downloads would virtually always return 0 (success), regardless of any issues encountered, and non-recursive fetches only returned the status corresponding to the most recently-attempted download.
Right now I'm working on a powershell function to handle this so I can add it to my nightly backup script. :D
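
For comparison, the same idea (run wget in the foreground and act on its exit code) might look like this in Python. It is an illustrative sketch, not the PowerShell function mentioned above; the table only restates the exit codes from the manual excerpt, and `run_wget` assumes wget is on the system path.

```python
import subprocess

# Exit codes as listed in the wget manual excerpt quoted above.
WGET_EXIT_CODES = {
    0: "No problems occurred.",
    1: "Generic error code.",
    2: "Parse error.",
    3: "File I/O error.",
    4: "Network failure.",
    5: "SSL verification failure.",
    6: "Username/password authentication failure.",
    7: "Protocol errors.",
    8: "Server issued an error response.",
}

def describe_exit(code):
    """Map a wget exit code to its description."""
    return WGET_EXIT_CODES.get(code, "Unknown exit code %d" % code)

def run_wget(url, dest, user_agent):
    """Run wget in the foreground (no -b) and return (exit_code, description)."""
    result = subprocess.run(["wget", "-q", "-U", user_agent, "-O", dest, url])
    return result.returncode, describe_exit(result.returncode)
```

Because there is no -b, `run_wget` only returns once the transfer has finished, so the exit code reflects the actual download.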

gotspatel
Senior user
Senior user
Posts: 339
Joined: 2013-10-08 05:42
Location: INDIA

Re: Script to Download Phishtank database for SA

Post by gotspatel » 2023-04-08 12:56

palinka wrote:
2023-04-08 12:31
add it to my nightly backup script. :D
:mrgreen: See I was just motivating you to update the backup script.

Waiting for an update soon. :D

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-04-08 13:39

gotspatel wrote:
2023-04-08 12:56
palinka wrote:
2023-04-08 12:31
add it to my nightly backup script. :D
:mrgreen: See I was just motivating you to update the backup script.

waiting soon an update. :D
OK, I just finished writing it. Give me a few days to test it. Requires wget in the system path.

I ran into the 0 byte file issue while testing on fake urls (in order to generate an error code). You might want to remove the -b and try to figure out what's going on. I might run into the same issue. I'm thinking about how to deal with that. The problem is the file is created first, then the download attempt is made. So maybe I could download to a temp file, and if the download was successful I would overwrite the local file. That way, if it fails, you don't end up with a 0 byte file that does nothing (or potentially worse) for your SA operation.
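
The temp-file idea described above could be sketched like this in Python. It is only an illustration of the approach: `replace_if_valid` is a made-up name, `fetch` stands for whatever download routine is used (it receives the temporary path to write to), and the "at least `min_bytes` long" check is a deliberately simple stand-in for real validation.

```python
import os
import tempfile

def replace_if_valid(fetch, dest, min_bytes=1):
    """Call fetch(tmp_path) to download, then move the result over `dest`
    only if it is at least min_bytes long. Returns True on replacement.
    Exceptions raised by fetch propagate after the temp file is cleaned up."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Create the temp file next to dest so the final step is a plain rename.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    os.close(fd)
    try:
        fetch(tmp_path)
        if os.path.getsize(tmp_path) < min_bytes:
            return False  # keep the previous feed file untouched
        os.replace(tmp_path, dest)
        return True
    finally:
        # Remove the temp file if it was not moved into place.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

With this shape, a failed or empty download leaves the previous feed file in place, so SA never sees a 0-byte file.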

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-07-26 17:34

phishstats.info is no more :(

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-07-26 22:04

RvdH wrote:
2023-07-26 17:34
phishstats.info is no more :(
Are you sure? It would not download twice last week, but it worked for me last night.

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-07-26 22:14

I get ERR_NAME_NOT_RESOLVED
And the last thing it downloaded was the HTML of some parking website

https://github.com/apache/spamassassin/ ... 73a2447730

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-07-26 23:15

Code: Select all

Non-authoritative answer:
Name:    77980.bodis.com
Address:  199.59.243.224
Aliases:  phishstats.info
Son of a bitch... Here's what I've been "downloading successfully".

Code: Select all

<!doctype html>
<html data-adblockkey="MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBANDrp2lz7AOmADaN8tA50LsWcjLFyQFcb/P2Txc58oYOeILb3vBw7J6f4pamkAQVSQuqYsKx3YzdUHCvbVZvFUsCAwEAAQ==_jF6xGMSgrVgdK+tS0Y4DG/GMdRPEnUa/3ObF/ExZ3QLo26EumVtBPmVQTXHwQVz0zE0Mr9FLfq1vgtZUtxvAuA==" lang="en">
<head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="icon" href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVQI12P4//8/AAX+Av7czFnnAAAAAElFTkSuQmCC">
    <link rel="preconnect" href="https://www.google.com" crossorigin>
</head>
<body>
<div id="target" style="opacity: 0"></div>
<script>window.park = "eyJ1dWlkIjoiOTc3NTRmNzgtZDc4OC00N2U5LWI2OTItZTAyZTQ3YTZkYWEwIiwicGFnZV90aW1lIjoxNjkwMzQzOTIzLCJwYWdlX3VybCI6Imh0dHBzOi8vcGhpc2hzdGF0cy5pbmZvL3BoaXNoX3Njb3JlLmNzdiIsInBhZ2VfbWV0aG9kIjoiR0VUIiwicGFnZV9yZXF1ZXN0Ijp7fSwicGFnZV9oZWFkZXJzIjp7InJlZmVyZXIiOlsiIl19LCJob3N0IjoicGhpc2hzdGF0cy5pbmZvIiwiaXAiOiIyMDUuMTg1LjEyMy4xMTgifQo=";</script>
<script src="/js/parking.2.106.5.js"></script>
</body>
</html>
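
A download that "succeeds" with a parking page like this could be caught by a simple content check before the feed file is replaced. The following is a heuristic sketch only; both function names are invented for illustration, and a real check might also want to verify the expected CSV layout.

```python
def looks_like_html(data):
    """Heuristic: True if the payload (bytes) starts like an HTML document."""
    head = data[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")

def looks_like_feed(data):
    """Accept non-empty payloads that do not look like an HTML page."""
    return bool(data.strip()) and not looks_like_html(data)
```

Running the downloaded bytes through `looks_like_feed` would have rejected the page above, since it begins with `<!doctype html>`.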

RvdH
Senior user
Senior user
Posts: 3088
Joined: 2008-06-27 14:42
Location: The Netherlands

Re: Script to Download Phishtank database for SA

Post by RvdH » 2023-07-27 09:34

ERR_NAME_NOT_RESOLVED was caused by AdGuard Home on my pi :lol:

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-09-23 20:00

Looks like phishstats is back.

https://phishstats.info/

Did SA ever put it back into phishing.pm?

palinka
Senior user
Senior user
Posts: 4349
Joined: 2017-09-12 17:57

Re: Script to Download Phishtank database for SA

Post by palinka » 2023-09-24 10:16

Looks like I spoke too soon. Sort of. You still cannot download the CSV, but you can query the database. However, they limit both the maximum results per query and the query rate, so it's basically impossible to get the entire database.

Anyway, it still looks dead. The last db entry is from Aug 8.

https://phishstats.info:2096/api/phishing?_sort=-date
