Another Regular Expression Question
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Another Regular Expression Question
Sorry, guys, I have spent hours trying to work this out by searching the forums and TRYING to understand the documentation about creating regex expressions, but it remains complete hieroglyphics to me. So forgive me for blatantly (and humbly) requesting your help on this.
I am trying to create a rule using the REGEX test that simply says:
IF field ([contains WORDA] or [contains WORDB] or [contains WORDC]) AND ([contains STRING1] or [contains STRING2]) then...
eg
if subject (contains 'itune' or 'apple' or 'paypal') AND (contains 'suspend' or 'restrict')
The above example would match:
itunes account will be suspended - ('itunes' and 'suspend' exist together)
will restrict your paypal account - ('restrict' and 'paypal' exist together)
but would not match:
itunes will terminate your account - ('itunes' exists but not 'suspend' or 'restrict')
I'm sure it's easy-peasy... if you can understand the foreign language that is REGEX. (Something to do with searches within 'group1' and then also a search within 'group2'. I have no idea, and have tried many permutations and supposed online regex generators.)
I would appreciate all help offered.
Thanks
Jim.
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Ok, I think I'm on my way:
(.*([Aa]pple|[Ii][Tt]unes|[Pp]aypal).*)(.*(suspend|restrict).*)
but it's still not quite right, as it only works if the words appear in the order listed.
ie,
Paypal will be restricted - WORKS
will restrict your Paypal - doesn't work, as the words are in the wrong order.
I can't work out how to make it match if ANY of the words from each group appear in ANY order.
Can anyone help?
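The order problem can be reproduced outside hMailServer. This is a minimal Python sketch using the standard `re` module, which is an assumption here and not the engine hMailServer uses; the helper name `matches` is made up for illustration. It shows that placing the two groups one after the other only matches when a word from the first group comes before a word from the second.

```python
import re

# The pattern from the post: brand group first, action group second.
ORDERED = re.compile(
    r"(.*([Aa]pple|[Ii][Tt]unes|[Pp]aypal).*)(.*(suspend|restrict).*)"
)

def matches(subject: str) -> bool:
    """True if the pattern is found anywhere in the subject."""
    return ORDERED.search(subject) is not None

print(matches("Paypal will be restricted"))  # brand before action
print(matches("will restrict your Paypal"))  # action before brand
```

The second call fails because, once "Paypal" has been consumed, there is no "suspend" or "restrict" left to the right of it.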
Re: Another Regular Expression Question
Make life easy and use two conditions, utilising the rule's AND, so you get:
When creating the rule select "use AND" (which is the default).
First condition regex:
(?i:^.*(itune|apple|paypal).*$)
Second condition regex (which is a logical AND because you selected it above):
(?i:^.*(suspend|restrict).*$)
Note the above is case insensitive, and this isn't fully tested but appears to work in the hmail rules setup. It may or may not work in javascript (probably not) or VBScript (possibly not, but you can force lowercase before doing the regex).
Much simpler than you thought (famous last words).
The problem with regex is that you have to know it all, and just when you get to the point where you think you do, you find out you don't and it's back to square one.
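For anyone wanting to try the two-condition idea away from the rules dialog, here is a rough Python sketch. It assumes Python's `re` module, which accepts the `(?i:...)` scoped flag but is not necessarily the same engine hMailServer uses; `rule_matches` is a made-up helper that stands in for the rule's "use AND" setting.

```python
import re

# The two conditions from the post, unchanged.
BRAND = re.compile(r"(?i:^.*(itune|apple|paypal).*$)")
ACTION = re.compile(r"(?i:^.*(suspend|restrict).*$)")

def rule_matches(field: str) -> bool:
    """Both conditions must hold, in whatever order the words appear."""
    return bool(BRAND.search(field)) and bool(ACTION.search(field))

for subject in (
    "itunes account will be suspended",
    "will restrict your paypal account",
    "itunes will terminate your account",
):
    print(subject, "->", rule_matches(subject))
```

The AND lives in the surrounding logic, not in either pattern, which is why word order stops mattering.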
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Hi Percepts
Yes, I will have to do that for now. But it's not ideal, as I want to run the same test against both the SUBJECT and the BODY. If I knew the proper regex way to achieve the 'both' test in one expression then my rule could be
where BODY = [regex] OR where SUBJECT = [regex]. As the test now has to be where FIELD = REGEX1 AND REGEX2, I can't do the OR across the separate fields (therefore needing 2 rules instead of one).
So if the regex way does come to light... it would be appreciated.
Cheers
Re: Another Regular Expression Question
It won't come to light unless you do it yourself. There is no AND operator in regex, and you are trying to use it like a programming language. It isn't; it's a pattern matching tool, and making it do what you want would make it very long-winded to get all the logic into it.
Write yourself some VBScript if you need that kind of logic.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Ah. There is no 'AND' operator. Well, that is that then. So 2 rules it is.
Cheers.
Re: Another Regular Expression Question
jimimaseye wrote:
Ok, I think I'm on my way:
(.*([Aa]pple|[Ii][Tt]unes|[Pp]aypal).*)(.*(suspend|restrict).*)
but it's still not quite right, as it only works if the words appear in the order listed.
ie,
Paypal will be restricted - WORKS
will restrict your Paypal - doesn't work, as the words are in the wrong order.
I can't work out how to make it match if ANY of the words from each group appear in ANY order.
Can anyone help?
/(.*(Apple|iTunes|Paypal).*)(.*(suspend|restrict).*)/ig
i = case insensitive
g = global search
Will match "itunes account will be suspended"
Playing with the online regex builder here -> http://regexr.com/
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
UPDATE:
The way to get it into the one rule:
subj ...regex... (paypal|itunes|apple).*(suspend|terminate)
OR
body ...regex... (paypal|itunes|apple).*(suspend|terminate)
OR
subj ...regex... (suspend|terminate).*(paypal|itunes|apple)
OR
body ...regex... (suspend|terminate).*(paypal|itunes|apple)
That should do it. That way it's testing for either order of the words and in either of the fields.
In case anyone is interested in my formula (to take on this endless Paypal/Apple ID/iTunes phishing spam malarkey):
(?i:^.*(itune|apple|paypal|account).*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit).*$)
and the reverse:
(?i:^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit).*(itune|apple|paypal|account).*$)
Obviously this will be subject to modification of the terms as new spam terminologies get sent out, and may not suit everyone's situation (I am confident it suits my account).
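The four OR'd conditions can be sketched in Python roughly as follows. This is only an illustration under some assumptions: Python's `re` stands in for whatever engine hMailServer uses, the subject and body are treated as single-line strings, and `is_phish` is a made-up helper, not an hMailServer function.

```python
import re

BRANDS = "itune|apple|paypal|account"
ACTIONS = ("verif|update|udapte|froze|confirm|rectif|expir|"
           "informations|suspend|restrict|limit")

# Forward and reverse formulas from the post, built from the two word lists.
FORWARD = re.compile(f"(?i:^.*({BRANDS}).*({ACTIONS}).*$)")
REVERSE = re.compile(f"(?i:^.*({ACTIONS}).*({BRANDS}).*$)")

def is_phish(subject: str, body: str) -> bool:
    """True if either field matches either word order (four tests OR'd)."""
    return any(
        pattern.search(field)
        for field in (subject, body)
        for pattern in (FORWARD, REVERSE)
    )

print(is_phish("Please Verify Your Paypal Account", ""))
print(is_phish("Notification of new and/or changed deliveries", ""))
```

Keeping the word lists in one place means a new term only has to be added once, where the rule dialog needs it pasted into every copy of the expression.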
Re: Another Regular Expression Question
That's what I meant by long-winded. It can be done in one regex but it requires several ORs and swapping stuff round.
There are more commands which look ahead that are effectively like an AND, but it gets completely unreadable if you have a memory like a sieve (me) and rarely use regex.
I just use spamassassin and auto-update my rules once a day. Much easier, and I don't have to keep modifying regex patterns. SA do it for you.
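For reference, the look-ahead approach mentioned above might look like the following in Python. Whether hMailServer's regex engine accepts look-aheads is not established anywhere in this thread, so treat this purely as a sketch of the technique; `has_both` is a made-up name.

```python
import re

# Each (?=...) checks from the start of the string without consuming text,
# so two of them in a row behave like an AND, regardless of word order.
BOTH = re.compile(
    r"^(?=.*(?:itune|apple|paypal))(?=.*(?:suspend|restrict)).*$",
    re.IGNORECASE,
)

def has_both(field: str) -> bool:
    return BOTH.search(field) is not None

print(has_both("will restrict your paypal account"))
print(has_both("itunes account will be suspended"))
print(has_both("itunes will terminate your account"))
```

The trailing `.*$` is there so the same pattern also works with engines that insist on matching the whole string.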
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
I also have spamassassin with daily updates but it doesn't make any impact on these kinds of emails at all. They all seem to slip through as new sources and methods. That is why I am creating this rule. (Very occasionally, one might get recognised and scored high enough, but maybe only 1 in 7 - the rest slip through.)
I didn't think that was too long-winded really; it was just a case of inserting the new word into the list and then copying the line 3 times.
Re: Another Regular Expression Question
so what was all the fuss about... amazing what you can do if you try
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Well, the first was that initially I couldn't get anything to work as I simply didn't understand how to start the query and secondly I wasn't aware that there was no AND operator. And actually it was your reply that helped me on my way. Now, with your help, I have my answer. Ta.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Hi Percepts
I was trying to make sense of the regex expression so was matching it against some online 'help' docs. I wondered about the starting ^ and the ending $ symbols. If I remove them and do a test match then the expression still seems to work. So I was wondering if I am misunderstanding something that you know better, and whether I should leave them in as you stated? (I understand the remaining parts of the expression, although I'm not sure of the formatting of the i: or the reliance on the ?) Could you explain the ^ and $ please?
Here is my current expression (without the above symbols):
Code: Select all
(?i:.*(itune|apple|paypal|account).*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer).*)
Re: Another Regular Expression Question
In many cases they aren't required, but I always put them in for clarity and to remove uncertainty, especially when you come back and look at your regex 6 months later and have forgotten exactly how regex works or what your intention was at the time you wrote it.
And if you have an OR where you want to go back and look from the start of the line then ^ will be required, and using $ makes it clear that this section of the regex runs to the end of the line.
And bear in mind that there are different implementations of regex and how it works. It depends on the regex implementation that was used for compiling hmailserver and hmailadmin as opposed to, for example, the implementation in PHP for webadmin. And unix and perl implementations may be different.
And that means you have to be very careful about where you are reading up on regex usage, because if it's talking about unix or perl, what you're reading may not apply to the windows version of javascript, for example, or to what hmail uses.
So beware of online javascript-based testing tools, especially if you don't know whether the website is running on windows or unix and whether the page uses PHP or ASP or some other web scripting language. Always test it on your target platform and software.
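To make the ^ and $ question concrete, here is a small Python illustration. Python's `re.search` is used as a stand-in and, as said above, other implementations may behave differently; the helper `hit` is made up for this example.

```python
import re

def hit(pattern: str, text: str) -> bool:
    """True if the pattern is found somewhere in the text."""
    return re.search(pattern, text) is not None

text = "your paypal account"

print(hit(r"paypal", text))         # found in the middle
print(hit(r"^paypal", text))        # ^ pins the match to the start
print(hit(r"account$", text))       # $ pins the match to the end
print(hit(r"^.*paypal.*$", text))   # .* at both ends soaks up the rest
```

With `.*` at both ends and a search-style engine, the anchors add nothing to the result; they only document the intent that the expression covers the whole line.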
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Ok, well for sure I understand the whole complication of ensuring I am looking at the correct version of regex! I have seen several conflicting opinions/versions of how to do things and I reckon this definitely added to my overall confusion at the beginning.
Regarding the 2 symbols: ok, well without them it seems to work. But for clarity I will ensure they are included, and then in the future, should I ever come to dabble in doing another regex expression and use this as a starting point, they may well be the difference between my next one working or not.
Cheers.
Re: Another Regular Expression Question
you don't even need the .* at start and end if you remove the ^ and $ so I think you have a way to go yet.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
I just tried that and it didn't seem to work. (Removing those symbols and the .* at the beginning and end resulted in NO match, so I will leave things as is.) It seems the .* is required at both ends.
test:
(?i:(itune|apple|paypal|account).*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer))
against
"your itunes is suspended now"
Re: Another Regular Expression Question
Well there you are: it works in my javascript tester but not in hmail, which only goes to prove I should listen to my own advice and test in the target OS and software.
It could be that hmail assumes ^ and $, or it could be the implementation of regex that is being used; I don't know.
?i: doesn't work in my javascript tester, but I knew that.
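The difference seen here is what you would expect if one tool searches for the pattern anywhere in the text while the other requires the pattern to cover the whole text. That hMailServer does the latter is only the guess made above, not something confirmed in this thread. The Python sketch below mimics the two behaviours with `re.search` and `re.fullmatch`; the helper names are made up.

```python
import re

BRANDS = "itune|apple|paypal|account"
ACTIONS = ("verif|update|udapte|froze|confirm|rectif|expir|informations|"
           "suspend|restrict|limit|dear user|dear customer")

BARE = f"(?i:({BRANDS}).*({ACTIONS}))"        # no ^, $ or outer .*
PADDED = f"(?i:.*({BRANDS}).*({ACTIONS}).*)"  # .* at both ends

TEXT = "your itunes is suspended now"

def found_anywhere(pattern: str, text: str) -> bool:
    return re.search(pattern, text) is not None

def covers_whole_text(pattern: str, text: str) -> bool:
    return re.fullmatch(pattern, text) is not None

print(found_anywhere(BARE, TEXT))       # search-style tester: match
print(covers_whole_text(BARE, TEXT))    # whole-text matching: no match
print(covers_whole_text(PADDED, TEXT))  # padded with .*: match again
```

Under whole-text matching the leading "your " and trailing "ed now" have nothing to match against in the bare pattern, which is consistent with the .* being needed at both ends.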
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Here you go, Percepts: one of the phishing emails (that I aim to capture with my regex formula) has come in and been caught. I post the headers here and you can see how it completely slips through spamassassin (the last header is a custom header added by my rule so I can see why it ends up in the trash - rule or user).
And here is another that has just come in (the second set of headers below).
Note that this 2nd one doesn't have the TVD_PH_BODY_ACCOUNTS_PRE rule catching it, or anything similar.
Code: Select all
Return-Path: paypal@support.com
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on decerver
X-Spam-Flag: YES
X-Spam-Level: ***
X-Spam-Status: Yes, score=3.6 required=3.0 tests=HTML_MESSAGE,MIME_HTML_ONLY,
SHORTENED_URL_SRC,TVD_PH_BODY_ACCOUNTS_PRE,T_RP_MATCHES_RCVD autolearn=no
version=3.3.2
X-Spam-Report:
* -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay
* domain
* 0.0 HTML_MESSAGE BODY: HTML included in message
* 1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
* 1.0 SHORTENED_URL_SRC RAW: SHORTENED_URL_SRC
* 1.5 TVD_PH_BODY_ACCOUNTS_PRE The body matches phrases such as "accounts
* suspended", "account credited", "account verification"
*
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <www-data@ne302.plesk.neen.it>
Received: from mailin2.hostvue.com (mailin2.hostvue.com [195.26.90.112]) (authenticated
user=user@mydomain.co.uk bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
verify=YES); Thu, 26 Jun 2014 15:56:27 +0100
X-Sieve: CMU Sieve 2.4
Received: from ne302.plesk.neen.it ([85.159.145.120]) by mailin2.hostvue.com with
esmtps (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from <www-data@ne302.plesk.neen.it>)
id 1X0B6B-00081U-9x for user@mydomain.co.uk; Thu, 26 Jun 2014 15:56:27
+0100
Received: by ne302.plesk.neen.it (Postfix, from userid 33) id 794D825824D; Thu, 26 Jun
2014 16:56:26 +0200 (CEST)
To: user@mydomain.co.uk
Subject: [SPAM] [3.6] Your account has been suspended !
X-PHP-Originating-Script: 33:1.php
From: PayPal <paypal@support.com>
Reply-To: paypal@support.com
MIME-Version: 1.0
Content-Type: text/html
Message-Id: <20140626145626.794D825824D@ne302.plesk.neen.it>
Date: Thu, 26 Jun 2014 16:56:26 +0200 (CEST)
Content-Transfer-Encoding: quoted-printable
X-Spam-Prev-Subject: Your account has been suspended !
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-Score: 5
hMailRule_DeletePhishSpam: Yes
Code: Select all
Return-Path: paypal-security@paypal.com
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on decerver
X-Spam-Flag: YES
X-Spam-Level: ******
X-Spam-Status: Yes, score=6.3 required=3.0 tests=DKIM_ADSP_DISCARD, HTML_IMAGE_ONLY_04,HTML_MESSAGE,HTML_SHORT_LINK_IMG_1,MIME_HTML_ONLY,
RCVD_VIA_APNIC,URI_NOVOWEL,XM_PHPMAILER_FORGED autolearn=no version=3.3.2
X-Spam-Report:
* 1.8 DKIM_ADSP_DISCARD No valid author signature, domain signs all mail
* and suggests discarding the rest
* 0.5 URI_NOVOWEL URI: URI hostname has long non-vowel sequence
* 0.0 HTML_MESSAGE BODY: HTML included in message
* 1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
* 0.3 HTML_IMAGE_ONLY_04 BODY: HTML: images with 0-400 bytes of words
* 0.1 HTML_SHORT_LINK_IMG_1 HTML is very short with a linked image
* 2.4 XM_PHPMAILER_FORGED Apparently forged header
* 0.0 RCVD_VIA_APNIC Received through Asia or Pacific islands (Oceana)
*
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <paypal-security@paypal.com>
Received: from mailin4.hostvue.com (mailin4.hostvue.com [195.26.90.114]) (authenticated
user=danny@mydomain.co.uk bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
verify=YES); Thu, 26 Jun 2014 19:31:19 +0100
X-Sieve: CMU Sieve 2.4
Received: from sg2nlvphout02.shr.prod.sin2.secureserver.net ([182.50.132.196]) by
mailin4.hostvue.com with esmtp (Exim 4.80.1) (envelope-from <paypal-security@paypal.com>)
id 1X0ES6-00022y-Fd for danny@mydomain.co.uk; Thu, 26 Jun 2014 19:31:19
+0100
Received: from ip-118-139-182-12.ip.secureserver.net ([118.139.182.12]) by
sg2nlvphout02.shr.prod.sin2.secureserver.net with : DED : id
K6XE1o02P0GTTAm016XEaU; Thu, 26 Jun 2014 11:31:15 -0700
x-originating-ip: 118.139.182.12
Received: (qmail 12271 invoked by uid 10000); 26 Jun 2014 23:31:14 +0500
Date: Thu, 26 Jun 2014 23:31:14 +0500
To: danny@mydomain.co.uk
From: =?UTF-8?Q??= <paypal-security@paypal.com>
Reply-To: noreply@paypal.com
Subject: [SPAM] [6.3] Please Verify Your Paypal Account
Message-ID: <2df22738358f2114c64f8c82fdad2508@www.apnaplot.pk>
X-Priority: 3
X-Mailer: PHPMailer (phpmailer.sourceforge.net) [version ]
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/html; charset="iso-8859-1"
X-Spam-Prev-Subject: =?UTF-8?Q?Please_Verify_Your_Paypal_Account?=
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-Score: 5
hMailRule_DeletePhishSpam: Yes
Re: Another Regular Expression Question
I must be missing something, both of those mails are flagged as spam by spamassassin. Your required score is set to 3.
They both have: X-Spam-Flag: Yes
so how have they slipped through ?
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
They have been flagged as spam, but not for the definitive reason of being a phishing email, and only because I set the score of 3 by choice; if I had set it higher, say 4, then the first one wouldn't have been caught. Note that if the 1st email had adopted the BODY image type of the 2nd email (ie just an image) then it wouldn't have been scored 1.5 for TVD_PH_BODY_ACCOUNTS_PRE, and the overall score would have been 2.1 ...and NOT flagged as spam.
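The arithmetic can be checked against the X-Spam-Report of the first email. This is just a Python sketch of the sums: the rule names and scores are copied from the headers posted earlier, the helper names are made up, and it assumes a message is flagged when its total reaches the required score.

```python
from decimal import Decimal

REQUIRED = Decimal("3.0")  # "required=3.0" in the X-Spam-Status header

# Per-rule scores from the first email's X-Spam-Report.
first_email = {
    "T_RP_MATCHES_RCVD": Decimal("-0.0"),
    "HTML_MESSAGE": Decimal("0.0"),
    "MIME_HTML_ONLY": Decimal("1.1"),
    "SHORTENED_URL_SRC": Decimal("1.0"),
    "TVD_PH_BODY_ACCOUNTS_PRE": Decimal("1.5"),
}

def total(scores: dict) -> Decimal:
    return sum(scores.values(), Decimal("0"))

def is_flagged(scores: dict) -> bool:
    # Assumption: flagged when the total is at or above the threshold.
    return total(scores) >= REQUIRED

# The same email if the body had been image-only and the phrase rule had not fired.
without_phrase_rule = {
    name: score for name, score in first_email.items()
    if name != "TVD_PH_BODY_ACCOUNTS_PRE"
}

print(total(first_email), is_flagged(first_email))
print(total(without_phrase_rule), is_flagged(without_phrase_rule))
```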
The reason I set it to flag as a WARNING spam rather than delete is because we still get GENUINE emails coming in from our suppliers that get scored between 3 and around 5.8 - and we can't risk such genuine mails being deleted. I have a score of 8 set as the delete threshold. My regex formula, though, definitely catches those specific emails without leaving anything to chance. If there was a spamassassin formula that effectively replicated my regex formula (narrowing down the risk) then it would be more reliable and I could increase its score on a match.
Of course you might say "well why don't you write one then, Jim?": the answer is simply this (pick one):
a, if I have to write a formula myself that's local to my system, then isn't that what I have already just done in hmailserver?
b, I have NO IDEA how to write such things and add them to spamassassin (and I'm sure you are fed up with me picking your brains already)
Last edited by jimimaseye on 2014-06-26 21:54, edited 1 time in total.
Re: Another Regular Expression Question
check out the SA-learn.exe program
http://spamassassin.apache.org/full/3.3 ... learn.html
although if it's just this one phishing source that is causing a problem then it's probably not worth the hassle.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Well, I was going to say that it looks too technical for my lame brain (I'm not good at starting from instructions and am better at picking things up and learning from others). But then I glanced down the page and saw this as part of the Getting Started section:
"Build a significant sample of both ham and spam.
I suggest several thousand of each, placed in SPAM and HAM directories or mailboxes."
And with that, I've got no chance! I don't have thousands, or even hundreds (barely 10's!) of spam. Our main spam nowadays is this phishing email that comes in to one specific account and is probably averaging 2 or 3 a day (the problem is the user is gullible enough to click one of the links, which is probably why only he gets these emails anyway). Almost all other spam is EXTREMELY rare and is rightfully handled by the default spamassassin rules (viagra, russian babes etc).
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
BTW: just for completeness, here is one of those GENUINE emails that we receive (there is nothing dodgy about it) that scores 5.8. Note it's not too dissimilar to the dodgy ones above.
Code: Select all
Return-Path: bet-planning@oursupplier.nl
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on decerver
X-Spam-Flag: YES
X-Spam-Level: *****
X-Spam-Status: Yes, score=5.8 required=3.0 tests=DEAR_SOMETHING, HTML_IMAGE_ONLY_16,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,KHOP_SC_CIDR8,
MIME_HTML_ONLY,RDNS_NONE,T_REMOTE_IMAGE autolearn=no version=3.3.2
X-Spam-Report:
* 1.7 DEAR_SOMETHING BODY: Contains 'Dear (something)'
* 0.0 HTML_MESSAGE BODY: HTML included in message
* 1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
* 1.0 HTML_IMAGE_ONLY_16 BODY: HTML: images with 1200-1600 bytes of words
* 0.6 HTML_MIME_NO_HTML_TAG HTML-only message, but there is no HTML tag
* 1.3 RDNS_NONE Delivered to internal network by a host with no rDNS
* 0.0 T_REMOTE_IMAGE Message contains an external image
* 0.0 KHOP_SC_CIDR8 Relay CIDR /8 is among worst in SpamCop
*
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <bet-planning@oursupplier.nl>
Received: from mailin5.hostvue.com (mailin5.hostvue.com [195.26.88.123]) (authenticated
user=sales@mydomain.co.uk bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
verify=YES); Thu, 26 Jun 2014 20:10:10 +0100
X-Sieve: CMU Sieve 2.4
Received: from [188.203.209.3] (helo=mail.coso.nl) by mailin5.hostvue.com with esmtps (TLSv1:RC4-MD5:128)
(Exim 4.80.1) (envelope-from <bet-planning@oursupplier.nl>) id
1X0F3h-00031Q-0z for SALES@mydomain.co.UK; Thu, 26 Jun 2014 20:10:09
+0100
X-MDAV-Processed: mail.coso.nl, Thu, 26 Jun 2014 21:13:21 +0200
Received: from tw-DataProvider ([195.225.34.232]) by coso.nl (mail.coso.nl) (MDaemon
PRO v11.0.3) with ESMTP id md50000102217.msg for <SALES@mydomain.co.UK>;
Thu, 26 Jun 2014 21:13:20 +0200
X-MDRemoteIP: 195.225.34.232
X-Return-Path: bet-planning@oursupplier.nl
X-Envelope-From: bet-planning@oursupplier.nl
X-MDaemon-Deliver-To: SALES@mydomain.co.UK
MIME-Version: 1.0
From: bet-planning@oursupplier.nl
To: SALES@mydomain.co.UK
Reply-To: planning@bet.be
Date: 26 Jun 2014 21:22:40 +0200
Subject: [SPAM] [5.8] Notification of new and/or changed deliveries
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Message-ID: <cmu-lmtpd-28197-1403809810-0@ms4.hostvue.com>
X-Spam-Prev-Subject: Notification of new and/or changed deliveries
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-2: The host name specified in HELO does not match IP address. - (Score: 2)
X-hMailServer-Reason-Score: 7
Re: Another Regular Expression Question
I presume the body text is different in each one.
Anyhow, you have a solution which is working for you now so problem solved.
Re: Another Regular Expression Question
OK, I've been wondering how to create SA rules, so I've had a little play and this is what I've come up with, which seems to work fine.
Note we are now on the Perl regex implementation (same as unix I think).
And you can delete the describe line if you don't want it.
Code: Select all
header __MY_PHISH_SUBJECT_1 Subject =~ /^.*(itune|apple|paypal|account).*$/i
header __MY_PHISH_SUBJECT_2 Subject =~ /^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit).*$/i
meta MY_PHISH_SUBJECT (__MY_PHISH_SUBJECT_1 && __MY_PHISH_SUBJECT_2)
score MY_PHISH_SUBJECT 7.0
describe MY_PHISH_SUBJECT My Phishing Rule
Put the above at the end of your pathtoSA/etc/spamassassin/local.cf file,
restart the spamd service and try it.
One word of warning: if your other SA scores add up to less than zero, it is possible the message will still get through, but that is unlikely. And note that you can add body tests into the above as well. See references: http://wiki.apache.org/spamassassin/WritingRules
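For anyone who wants to try out word lists before touching local.cf, the "group 1 AND group 2" idea can be sketched outside SpamAssassin. This is only an illustration in Python (whose `re` syntax is close to Perl's for patterns this simple), not how SpamAssassin itself evaluates rules; the function name `subject_is_phishy` is made up for the example.

```python
import re

# Word lists copied from the two header sub-rules above.
GROUP_1 = re.compile(r"itune|apple|paypal|account", re.IGNORECASE)
GROUP_2 = re.compile(
    r"verif|update|udapte|froze|confirm|rectif|expir|informations"
    r"|suspend|restrict|limit",
    re.IGNORECASE,
)


def subject_is_phishy(subject: str) -> bool:
    """Mimic the meta rule: a hit is needed in group 1 AND in group 2."""
    # search() looks anywhere in the string, so the ^.* and .*$ wrappers
    # used in the rules are not needed here.
    return bool(GROUP_1.search(subject)) and bool(GROUP_2.search(subject))
```

Fed the subjects from the first post, "itunes account will be suspended" and "will restrict your paypal account" both hit, while "itunes will terminate your account" does not, because nothing in it matches the second list.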
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Thanks for that percepts. It seems we are both on a learning curve right now then.
For your info, the 'body' type check actually already includes the Subject line (as I found out with my testing and then confirmed on that rules wiki page). So I only need the one rule (applied to 'body') to catch the words in either position. Here is the new rule:
Code: Select all
body __MY_PHISH_BODY_1 /^.*(itune|apple|paypal|account).*$/i
body __MY_PHISH_BODY_2 /^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer).*$/i
meta MY_PHISH_BODY (__MY_PHISH_BODY_1 && __MY_PHISH_BODY_2)
score MY_PHISH_BODY 8.0
describe MY_PHISH_BODY Apple/itunes/Paypal Body Phishing Rule
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
By the way (for readers generally), just to give you the complete picture: all my emails at the moment come in via External Download, so SpamAssassin only scores and doesn't delete or prevent delivery. The automated deletion of high-scoring emails is done with a global rule in hmailserver that looks for the X-Spam-Level header and checks whether it has at least 7 asterisks (*******). This way it deletes all spam scored 7 or higher (in fact my 'delete' only moves it directly to TRASH, giving the ability for a last-resort double check if needed). Therefore I can set the score of my custom rule to anything I want (I could set it to 100 if I wanted). I do have the hmailserver spam settings doing proper (unrecoverable) deletions if the HMAILSERVER score reaches 8 or higher, and marking as SPAM on reaching a score of 5. 5 is also the score given when SpamAssassin marks an email, with 3, 2 and 2 being awarded for the SPF, HELO and DNS-MX record checks respectively. I chose this method rather than simply using the SpamAssassin score because SpamAssassin scores are sometimes not so consistent (the same email checked twice can come back with different scores for some reason).
So to summarise my Spam check settings in Hmailserver are
- Spam Delete Threshold = 8
- SPF = 3
- HELO = 2
- DNS-MX = 2
- Use SpamAssassin = Yes
- Use SA score = No
- Score = 5
Then a global rule that checks the custom X-Spam-Level header for ******* (7 asterisks) and moves the message to 'Trash'.
So,
- if SA marks it as spam, then hmail scores it a 5.
  If hmail adds its own checks to the score and it reaches 8 or more, it gets deleted, otherwise it is just marked '[SPAM]'.
- else, if the hmail global rule sees SA has scored it 7 or higher, then it gets Trashed (moved to Trash, as will be the case with the custom rule discussed earlier, currently scored at 8).
- else, the message stays in the Inbox (with the subject prefixed with [SPAM]):
  "[SPAM] [3.5] Here is your invoice/original subject..."
Doing it this way safeguards against rogue SpamAssassin rule scores that might cause a direct delete (if I had used "Use SA score = yes") without my being able to double check. But of course, if SA has scored it AND hmail scores it with its own checks as well (reaching 8 or more), then there is less doubt and it is therefore safer to wipe out without fear.
It may sound complicated, but it's not really. And it works really quite well. (Out of 2350 emails, only 35 genuine ones have been retained after being falsely marked as [SPAM] (above a SpamAssassin score of 3), of which 19 are from 1 supplier who always scores 5.8 for some reason, with the other 16 scoring 3.x something.)
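The decision flow described in this post can be written down as a small function, which may make the three outcomes easier to follow. This is a rough Python model of my reading of the settings above, not anything hMailServer or SpamAssassin actually runs; the function name `route` and its flags are invented for the sketch, and the SpamAssassin threshold of 3.0 is taken from the "required=3.0" header shown earlier.

```python
SA_REQUIRED = 3.0       # SpamAssassin "required" score from the headers above
SA_FLAG_POINTS = 5      # hMailServer points when SpamAssassin flags a message
SPF_POINTS = 3
HELO_POINTS = 2
MX_POINTS = 2
MARK_THRESHOLD = 5      # hMailServer marks the message as spam
DELETE_THRESHOLD = 8    # hMailServer deletes the message outright
TRASH_STARS = 7         # global rule: 7 asterisks in X-Spam-Level


def route(sa_score, spf_fail=False, helo_mismatch=False, mx_missing=False):
    """Return where a message ends up under the settings described above."""
    sa_flagged = sa_score >= SA_REQUIRED
    hmail_score = (
        (SA_FLAG_POINTS if sa_flagged else 0)
        + (SPF_POINTS if spf_fail else 0)
        + (HELO_POINTS if helo_mismatch else 0)
        + (MX_POINTS if mx_missing else 0)
    )
    if hmail_score >= DELETE_THRESHOLD:
        return "deleted"
    # X-Spam-Level carries one asterisk per whole point of the SA score.
    stars = max(int(sa_score), 0)
    if stars >= TRASH_STARS:
        return "trash"
    if sa_flagged or hmail_score >= MARK_THRESHOLD:
        return "inbox, marked as spam"
    return "inbox"
```

For instance, the genuine supplier email above (SA 5.8 plus a HELO mismatch, hMail total 7) comes out as "inbox, marked as spam", while a message that only trips the custom rule at 8.0 goes to "trash".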
Re: Another Regular Expression Question
seems awful complicated but then its very rare I actually get any spam.
you can add a SA rule for that supplier domain/user and if its from them give it a negative score of say -5 so then they won't get marked as spam unless something changes.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Indeed, but it's not really. All I have done above is document the settings I have, as offered by hmailserver and SpamAssassin. I guess the explanation of WHY I have these settings may cause confusion or seem complicated, but I have simply set the 6 settings accordingly (finely balanced), plus a rule, to capture all 3 eventualities (not spam, potentially spam, or 'defo dont want to see this muck' spam).
Re: Another Regular Expression Question
One little tip, add following to your local.cf
#X-Spam-Score: 9999.99
add_header spam Score _SCORE_
increases your hmail rule/function decision making options in an easy to use way
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
I'm not following you, I don't understand. I can add it as quoted, yes, but could you explain the purpose a bit better please?
Ta.
(I do already have this, is this the same thing?:
Code: Select all
rewrite_header Subject [_HITS_]
It gives me the SA score in [brackets] on the subject. And I also already have the X-Spam-Level header in my mail headers giving the score in asterisks, i.e. **** for a score of 4).
Re: Another Regular Expression Question
Yes, HITS is the same value but is deprecated in favour of SCORE (HITS still works though).
Just put it in and see.
You get a header with just the SA score in it (including the decimal portion).
It means you can easily calculate what portion of the hmail score comes from hmail and what portion from SA without resorting to string searches of the subject or X-Spam-Status to find it.
i.e. you have the X-Spam-Score value and the X-hMailServer-Reason-Score value, which can be used to determine what portion came from hMail without doing string searches.
Just thought it may be useful, that's all, especially if you want to use the fractional portion of the returned SA score in any tests when fine-tuning what's spam and what's not.
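To show why a dedicated score header is handy, here is a small sketch that reads both numbers straight from a saved message without any string searching in the subject. It is written in Python purely for illustration (it is not an hMailServer script), and `read_scores` is a made-up helper name; the header names are the ones mentioned in this thread.

```python
from email import message_from_string
from typing import Optional, Tuple


def read_scores(raw_message: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (SpamAssassin score, hMailServer total) or None where missing."""
    msg = message_from_string(raw_message)

    def as_float(header_name: str) -> Optional[float]:
        value = msg.get(header_name)
        if value is None:
            return None
        try:
            return float(str(value).strip())
        except ValueError:
            # Header present but not a plain number.
            return None

    return as_float("X-Spam-Score"), as_float("X-hMailServer-Reason-Score")
```

With both values to hand, a test on the fractional SA score (say, anything above 5.5) becomes a simple numeric comparison.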
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Ok, understood. Cheers for that.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Hi percepts
I've been trying to figure this out but can't. Maybe you have the knowledge....
The search line:
/^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer).*$/i
Following on from the search string, I want it to continue to look for the word CONFIRM, but I don't want it to match CONFIRMATION.
I don't know how though and can't find it. (Something like "...|froze|confirm[!ation]|rectif....." ??)
Any ideas?
Cheers chap.
Re: Another Regular Expression Question
put a space after confirm
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
but that would stop the words "confirms", "confirmed" etc from being detected.
Re: Another Regular Expression Question
but you didn't provide that as a specification so you'll get an answer that answers what you ask.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
... |confirm[^a]| ...
Do you think this is correct? (going from http://www.zytrax.com/tech/web/regex.htm#brackets)
Re: Another Regular Expression Question
Try the following:
confirm[^a]
See the following for a quick reference, but be careful: ^ only works as NOT inside a class [].
http://regexlib.com/CheatSheet.aspx
p.s. I think our mails crossed. You got there already.
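A quick way to see what confirm[^a] does and does not catch is to run it against a few phrases. The sketch below uses Python's `re` as a stand-in for Perl regex (the two agree on these constructs) and also tries a negative lookahead, confirm(?!ation), which Perl supports as well. The lookahead is my own suggestion rather than something proposed in the thread, and `hits` is just a helper name for the example.

```python
import re

# The character-class version from the thread: "confirm" followed by
# any single character that is not "a".
CLASS_VERSION = re.compile(r"confirm[^a]", re.IGNORECASE)

# Alternative: "confirm" as long as it is not followed by "ation".
# Unlike the class version this does not need a following character.
LOOKAHEAD_VERSION = re.compile(r"confirm(?!ation)", re.IGNORECASE)


def hits(pattern, text: str) -> bool:
    """True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None
```

Both versions accept "confirmed" and "please confirm your account" and reject "order confirmation". The difference shows up when "confirm" is the very last word of the text: the class version needs one more character after it and so misses, while the lookahead version still hits.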
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
OK, it's been some weeks now and I have been tweaking and tailoring the regex to maximise effect and minimise false positives.
For readers who may be interested, the following is the current (and so far the best) version of my custom SpamAssassin rule (entered in my 'local.cf' file) that catches all those annoying paypal/apple/itunes "account been frozen etc" phishing emails.
(Note: my rule scores it as 7.0, as this satisfies my personal spam-catching setup, which places everything scoring 7 or above directly into the TRASH bin of the account, but users may wish to change the score to match their own scoring system.) Oh, and of course, you can change the description to read whatever you want (I left it verbatim for manual checking of emails). Also, with this regex setup I have only had to whitelist 1 genuine sender out of all our senders to ensure it doesn't get accidentally trapped (a hotel booking site, when they send booking confirmations out, because they quote the words 'account' and 'confirm' in their emails).
Here is the SpamAssassin rule:
Code: Select all
# Part 1 - look for inclusion of "itune/paypal" etc ("account" too dangerous for FROM)
header __MY_PHISH_HEAD_FROM From =~ /^.*([il]tune|app[li]e|paypal|amazon).*/i
body __MY_PHISH_BODY_1 /^.*([il]tune|app[li]e|paypal|amazon|account).*$/i
# Body Part 2 - look for key words/phrases as well (*CONFIRM* to exclude 'confirmAtion')
body __MY_PHISH_BODY_2 /^.*(verif|up ?date (your|the) (paypal|[il]tunes?|apple*|account)|up ?date your info|udapte|froze|confirm[^a]|rectif|expir|informations|suspend|restrict|(be|been|with|remove) limm?it|dear user|dear c[uo]st|dear ?,).*$/i
meta MY_PHISH_BODY ((__MY_PHISH_HEAD_FROM || __MY_PHISH_BODY_1) && __MY_PHISH_BODY_2)
score MY_PHISH_BODY 7.0
describe MY_PHISH_BODY Custom Apple/itunes/Paypal/Account Phishing (verif|update your/the [pay/itun/App/acc]|update your info|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|been/be/with/remove limit|dear user|dear c[ou]st|dear,)
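The look-alike spellings that this version is built to catch ("ltunes", "appie", "limmit", "up date") are easier to check with a small harness. Below is a hedged Python sketch that reuses the alternations from the rule above and combines them the same way as the meta line. It tests plain strings only, whereas SpamAssassin works on the parsed message, so treat it as a way of trying out word lists rather than a faithful emulation. The name `looks_like_phish` and the sample addresses are made up.

```python
import re

FROM_RE = re.compile(r"[il]tune|app[li]e|paypal|amazon", re.IGNORECASE)
BODY_1_RE = re.compile(r"[il]tune|app[li]e|paypal|amazon|account", re.IGNORECASE)
BODY_2_RE = re.compile(
    r"verif|up ?date (your|the) (paypal|[il]tunes?|apple*|account)"
    r"|up ?date your info|udapte|froze|confirm[^a]|rectif|expir"
    r"|informations|suspend|restrict|(be|been|with|remove) limm?it"
    r"|dear user|dear c[uo]st|dear ?,",
    re.IGNORECASE,
)


def looks_like_phish(from_header: str, body_text: str) -> bool:
    """(FROM hit OR body part 1 hit) AND body part 2 hit, as in the meta."""
    part_1 = bool(FROM_RE.search(from_header) or BODY_1_RE.search(body_text))
    part_2 = bool(BODY_2_RE.search(body_text))
    return part_1 and part_2
```

A body such as "Please up date your ltunes details" trips both parts even though neither "update" nor "itunes" is spelled normally, whereas "Your booking confirmation for your account" only trips part 1, because confirm[^a] skips "confirmation".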
Re: Another Regular Expression Question
Well done
You are now officially the hmail regex and spamassassin rule expert
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Now now Percepts, don't be so bashful.
For the record, 90% of the credit for this goes to you for your help. I merely tweaked the phrases being searched and chose the method of checking the emails.
I still stand bemused by the seemingly random 'flavours' of regex out there and by trying to find the one that SpamAssassin uses, never mind actually understanding the common 'language' of it. I still have a couple of pages open to refer to and often find my experience not matching what the help pages suggest. Sometimes I have to refer to existing rules in SpamAssassin to see if I can make any sense of them and then copy the style.
And all this, over and over again for the last 3 weeks.....just for the quoted rule. THAT is how expert I am!!
Thanks again.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Hi Percepts (or others)
One of the tricks the spammers are using is to use funny character sets that have a character that LOOKS like a normal ANSI letter.
For example, it will look like "Dear iΤunes Customer," but the actual HTML code is "Dear i&#932;unes Customer".
See the screenshot for exactly how it looks. I've been really trying to work out how to write a regex for this.
At the moment I have (for the standard word):
/^.*itunes.*$/i
but how do I change this for the one that has the '&#932;' character? I simply can't work it out.
e.g. /^.*i\x{932}unes.*$/i or something (btw this doesn't work, it's an example)
Any ideas?
Cheers.
p.s. We are talking about regex for SpamAssassin rules.
Re: Another Regular Expression Question
you look for
&#932;unes
and bin it if you find it, on the basis that anyone using html character values for a capital T is probably a spammer.
Re: Another Regular Expression Question
percepts wrote: you look for
&#932;unes
and bin it if you find it on the basis that anyone using html character values for a capital T is probably a spammer.
to expand it a little you look for
i(&#932;unes|tunes)
at the relevant point in your regex
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Hi Percepts
Tried that, it doesn't work.
I created this test rule:
body MY_TEST /^.*i&#932;unes.*$/i
score MY_TEST 5.0
but this rule does not get evaluated.
I have listed a cut-down version of the HTML-based email with the hooky word in it so you can save it and launch it to see the code and the evaluated spam headers yourself (all personally identifiable info changed). Just copy and paste the code into Notepad and save as .EML.
You could also run a test on it yourself if you feel that way inclined. ;)
Code: Select all
Return-Path: ituens@appel.com
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <arh24309@server83.creativ-hosting.de>
Received: from mail.hosting.com (mail.hosting.com [165.26.90.116]) (authenticated
user=user1@gmail.com bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
verify=YES); Mon, 04 Aug 2014 21:08:16 +0100
X-Sieve: CMU Sieve 2.4
Received: from server83.creativ-hosting.de ([91.250.83.61]) by mail.hosting.com
with esmtps (UNKNOWN:AES256-GCM-SHA384:256) (Exim 4.72) (envelope-from <arh24309@server83.creativ-hosting.de>)
id 1XEOYJ-0003ub-D7 for user1@gmail.com; Mon, 04 Aug 2014 21:08:16
+0100
Received: by server83.creativ-hosting.de (Postfix, from userid 10017) id DB2CD61D69;
Mon, 4 Aug 2014 22:07:47 +0200 (CEST)
Date: Mon, 4 Aug 2014 22:07:47 +0200
To: user1@gmail.com
From: =?UTF-8?Q?Apple?= <ituens@appel.com>
Subject: [SPAM] [9.7] (iTunes) Account disabled
Message-ID: <14df21d87d3d74f9216a8363df3ccacb@www.draht-rogel.de>
X-Priority: 3
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset="us-ascii"
X-Spam-Prev-Subject: =?UTF-8?Q?=28iTunes=29_Account_disabled?=
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-Score: 5
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
</head>
<body>
<span
style=3D"line-height: 17.04px; font-weight: bold;"><br
style=3D"line-height: 17.04px;">
<br style=3D"line-height: 17.04px;">
Dear i&#932;unes Customer,</span><br
</table>
</body>
</html>
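One possible reason a pattern written against the entity never hits is that the text being tested has already had its HTML decoded, so the six characters of the numeric entity 932 have become the single Greek capital tau (U+03A4). As far as I understand the WritingRules wiki linked earlier, SpamAssassin 'body' rules see rendered text while 'rawbody' rules see the markup, but treat that as an assumption to verify on your own setup. The Python sketch below only demonstrates the general effect; the variable and function names are made up.

```python
import html
import re

RAW_LINE = "Dear i&#932;unes Customer,"

# Pattern aimed at the markup as it sits in the message source.
RAW_PATTERN = re.compile(r"i&#932;unes", re.IGNORECASE)

# Pattern aimed at decoded text: a normal "t" or the Greek capital tau.
DECODED_PATTERN = re.compile(r"i[t\u03a4]unes", re.IGNORECASE)


def decode(line: str) -> str:
    """Turn HTML entities into the characters a reader would see."""
    return html.unescape(line)
```

Against the raw line only the entity pattern hits; against the decoded line only the character pattern hits, and a plain "itunes" pattern hits neither form of the spammer's spelling. Which form a given rule type is matched against is the thing to establish first.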
Re: Another Regular Expression Question
"You could also run a test on it yourself if you feel that way inclined"
I don't.
You develop and test your own solutions. You've had more than enough help/pointers on this. If you aren't capable of doing it yourself then find someone local to you and PAY them to do it for you.
Re: Another Regular Expression Question
Turtleneck wrote: This percepts guy sounds like he shouldnt be here.
I think you missed a lot of the conversation, and you are jumping to rash conclusions.
Percepts is VERY helpful with the technical stuff, and has helped heaps.
We are all volunteers here.
Be nice and play nice, or don't bother coming back.
Who shouldn't be here are trolls that sign up just to be rude.
Just 'cause I link to a page and say little else doesn't mean I am not being nice.
https://www.hmailserver.com/documentation
Re: Another Regular Expression Question
It has obviously escaped you that this is an hmailserver support forum where we help, for the most part, people to get their hmail installation up and running when they are experiencing problems with it, or with resolving security loopholes etc. Users are expected to have a good deal of technical savvy if they are taking on being a mailserver administrator.
What it is NOT is a programming tutorial site. It is not a SpamAssassin tutorial site. That is someone else's software. We help a little with getting SA installed, but once the person wants constant help with tailoring someone else's software I draw the line. Go to the support forum for that software and not here. And this is not a regex/Perl tutorial or support forum either.
So why are you here? No, don't bother to answer that. People here are busy and don't have time to waste on people who make posts which aren't hmail support questions.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Hello
Well, I am shocked. I have to agree that Percepts has totally taken this somewhere I didn't intend it to go. The winking eye at the end should have given a clue that I was being very tongue-in-cheek about 'running a test yourself'. But that said, it's not unusual for people to do that themselves anyway without anyone asking them. The reason I gave the code was so that ANYONE, not only Percepts, could see the code I was dealing with and see why the suggestion was failing. And as Turtleneck says, of course I wouldn't have been asking if I was able to do it. And if all forum responses were "test it yourself or go pay someone" then there would be no point to having this forum, would there.
Turtleneck: Percepts is/has been very helpful in the past. I do see why you said what you said, but to be sure and avoid doubt, he can be quite helpful. I would recommend using this forum anyway for your questions and queries, as I have had a lot of help in the past, and there are many other users (including Matt the moderator) who normally offer help one way or another (and don't ask to be paid for it).
So, my query still stands: does anyone know how to write a regex that is able to catch that sort of character set/encoding, as detailed in my report above? (I don't even know what the terminology is for such a thing. Is it some encoding, or an HTML character set? I don't know.) So I would be grateful for anyone's help.
p.s. I would p1ss myself now if Turtleneck provides the answer.
EDIT: Just read your post, Percepts. You didn't say that when the initial thread started, entitled "Another Regular Expression Question", did you! Plus here is a 'new trick' for you to learn: find more polite ways of telling people you are no longer willing to help. Please.
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
Re: Another Regular Expression Question
It's all Greek to me.
Re: Another Regular Expression Question
Lighten up guys. If you can't help or don't want to help, don't reply, simple as that.
Perhaps this site can help with your regex question http://www.regexe.com/
If at first you don't succeed, bomb disposal probably isn't for you! ヅ
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Well said.
I have my answer courtesy of one of our fellow forum users. Valuable help and much appreciated too.
The answer, for future readers and reference, was:
Code: Select all
rawbody /i&\#932;unes./i
There are 2 key points:
1. It's a RAWBODY check (as opposed to a BODY check) that is needed.
2. The opening ^.* and closing .*$ that I was originally using (from Percepts' original suggestion early on in the thread) were causing the problem. Only when they were removed did this get picked up.
Note: of course, don't forget the \ to escape the hash (otherwise it is treated as the start of a comment).
Thanks to all involved (when they did).
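For future readers wondering why the raw/decoded distinction matters, here is a minimal sketch in Python (not SpamAssassin itself, and not the rule engine's actual code). It assumes, as I understand the two rule types, that a rawbody-style check sees the HTML source with the entity still written as &#932;, while a body-style check sees text in which the entity has already been decoded to the Greek capital Tau (U+03A4). The sample sentence and all variable names are made up for illustration.

```python
import html
import re

# Hypothetical raw HTML line, entity still encoded (what a raw check would see).
raw = "Your i&#932;unes account will be suspended"

# Hypothetical decoded view: &#932; becomes Greek capital Tau (U+03A4).
decoded = html.unescape(raw)

# Same pattern text as the rule in the post. In Python '#' is not special
# outside verbose mode, so the backslash is kept only for parity.
entity_pattern = re.compile(r"i&\#932;unes.", re.IGNORECASE)

raw_hit = entity_pattern.search(raw) is not None          # entity is present
decoded_hit = entity_pattern.search(decoded) is not None  # entity is gone

# A pattern written against the decoded character instead of the entity.
tau_pattern = re.compile("i\u03a4unes", re.IGNORECASE)
tau_hit = tau_pattern.search(decoded) is not None

print(raw_hit, decoded_hit, tau_hit)  # True False True
```

The entity pattern only hits the raw text, and the Tau pattern only makes sense against decoded text, so which one you need depends on which view of the message the check is applied to. This sketch says nothing either way about whether the ^.* and .*$ anchors were a factor.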
Re: Another Regular Expression Question
jimimaseye wrote:
Well said.
I have my answer courtesy of one of our fellow forum users. Valuable help and much appreciated too.
The answer, for future readers and reference, was
Code: Select all
rawbody /i&\#932;unes./i
There are 2 main key points:
1, its a RAWBODY check (as opposed to BODY check) that is needed.
2, the opening ^.* and closing .*$ that I was originally using (from Percepts original suggestion early on in the thread) was causing the problem. Only when it was removed did this then get picked up.
Note: of course dont forget the \ for escaping the hash (otherwise it is classed as a start of a comment)
Thanks to all involved (when they did).
I told you earlier in the thread that .* wasn't "required", so don't try to cover your ignorance by blaming me. Furthermore, you have broken forum rules by adding a new question onto the end of the last. And lastly, I made precisely zero reference to .* for this last question, but again in your ignorance you have tried to apply something from an earlier question and got it wrong. If you were so smart you wouldn't be here asking questions. .* is NOT the reason for whatever you used being at fault. It is not possible for it to be wrong because of what its presence means. It is your own inability to get it right which was the problem.
- jimimaseye
- Moderator
- Posts: 10060
- Joined: 2011-09-08 17:48
Re: Another Regular Expression Question
Wow. Are you in a bad mood or something, or is this normal? You really don't want to let this go? Despite DooM's suggestion?
No one was "trying to blame" you, and I don't think anyone would have read it as that. Your name was mentioned as a reference to earlier postings in the conversation with me. But you seem hell-bent on responding with a non-response and creating ill feeling about this instead of taking it for what it is. And if you want to attack me for what I write, well OK, let's do it your way...
Regarding the $ and ^ :
percepts wrote: In many cases they aren't required but I always put them in for clarity and to remove uncertainty,...
Yep. That's well and truly saying they aren't required. You definitely insisted not to bother with them there ..... apart from the few cases remaining when 'many' (leaving 'some', as it isn't ALL) didn't apply .... and, err... "always". Oh look, you just ALL.)
percepts wrote: And lastly I made precisely zero reference to .* for this last question
I never said .* was the problem either; I said "the opening ^.* and closing .*$." Don't quote me on something that was never there to be quoted (especially given you're doing such a bad job of reading your own quotes above).
percepts wrote: .....in your ignorance you have tried to apply something from an earlier question and got it wrong. If you were so smart you wouldn't be here asking questions. .* is NOT the reason for whatever you used being the fault..... It is your own inability to get it right which was the problem.
Well yeah. It's my inability to get it right which leads me TO ASK FOR CORRECTION! So shoot me for using a forum. (How dare I?!)
percepts wrote: to expand it a little you look for
i(Τunes|tunes)
That was following my initial statement that "At the moment I have (for the standard word): ^.*itunes.*$/i".
Again, that's you NOT saying "remove .*", and instead simply referring to the inner content around the itunes word, just as I had asked for. And THAT implies the rest of the quoted formula doesn't need correction (ergo is correct). You didn't say anything about changing or ensuring 'rawbody' as part of the expression either. And that is why I mentioned it in my summary.
percepts wrote: in your ignorance you have tried to apply something from an earlier question and got it wrong. If you were so smart you wouldn't be here asking questions.
You're absolutely right on this one. (Yey!!!) I have "applied something from an earlier question", about REGEX formulas, on a thread entitled "Another Regular Expression Question", ...answered by you, ...in my "ignorance" because I don't know the answer, ....and yes, if I WERE so smart knowing the answers I wouldn't be here asking the questions. Would I?!! So it seems your logic is:
"someone comes on asking a question about something they have already written and already know. How dare they think they know anything?! and that seems just bang out of order so I'm not going to correct them and just chastise them for thinking they know everything...(despite them coming on the forum to ask for correction to the logic they have quoted)"
Oh, and whilst we are pointing out your involvement, which you were so very keen to raise and deny: at the end I also said "Thanks to all involved (when they did)", which included YOU as a contributor to the early solutions on which my latest question was based. For me, I had no problems with you (other than your obvious lack of sense of humour and misreading of when to reply and when not to reply, like now), and was/AM appreciative of your contributions to the solutions to my problems. But somehow, you seem keen on making me change my opinion and continuing YOUR bad-tempered aggression. I don't know why. Is it because you feel an authority due to your contribution history and that makes you exempt from politeness? It's a shame. Your answers and help are sufficient to get respect. Your terse remarks and confrontations serve to negate it.
I'm done. I will post here no more. I hate this. And if you wanted the 'get one over him' feeling, then purely the fact that I have felt I needed to respond to you in this manner means that you've got it. Well done, old dog.
Re: Another Regular Expression Question
Ask away Turtleneck, just put it in off topic discussions
I am locking this thread. Please all collect your toys and put them back in your pram. Thanks!