Another Regular Expression Question

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Another Regular Expression Question

Post by jimimaseye » 2014-06-25 16:43

Sorry, guys. I have spent hours trying to work this out by searching the forums and TRYING to understand the documentation about creating regex expressions, but it remains complete hieroglyphics to me. So forgive me for blatantly (and humbly) requesting your help on this.

I am trying to create a rule using the REGEX test that simply says:

IF field ([contains WORDA] or [contains WORDB] or [contains WORDC]) AND ([contains STRING1] or [contains STRING2]) then...

eg
if subject (contains 'itune' or 'apple' or 'paypal') AND (contains 'suspend' or 'restrict')

This above example would match:

itunes account will be suspended - ('itunes' and 'suspend' exist together)
will restrict your paypal account - ('restrict' and 'paypal' exist together)

but would not match:

itunes will terminate your account - ('itunes' exists but not 'suspend' or 'restrict')

I'm sure it's easy-peasy... if you can understand the foreign language that is REGEX. (Something to do with searching within 'group1' and then also searching within 'group2'. I have no idea, and have tried many permutations and supposed online regex generators.)

I would appreciate all help offered.

Thanks

Jim.
HMS 5.6.6 B2383 on Win Server 2008 R2 Foundation, + 5.6.7-B2415 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-25 17:38

Ok, I think I'm on my way:

(.*([Aa]pple|[Ii][Tt]unes|[Pp]aypal).*)(.*(suspend|restrict).*) but it's still not quite right, as it only works if the words appear in the order listed.

i.e.
Paypal will be restricted - WORKS
will restrict your Paypal - doesn't work, as the words are in the wrong order.

I can't work out how to make it match if ANY of the words from each group appear in ANY order.

Can anyone help?

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-25 18:45

Make life easy and use two conditions, utilising the rule's AND option, so you get:

when creating rule select "use AND" (which is the default)

first condition regex
(?i:^.*(itune|apple|paypal).*$)

second condition regex ( which is a logical AND because you selected it above)
(?i:^.*(suspend|restrict).*$)


Note that the above is case-insensitive. It isn't fully tested but appears to work in the hMailServer rules setup. It may or may not work in JavaScript (probably not) or VBScript (possibly not, but you can force lowercase before doing the regex).

Much simpler than you thought (famous last words).

The problem with regex is that you have to know it all, and just when you get to the point where you think you do, you find out you don't and it's back to square one.
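
For readers who want to try the two-condition idea outside hMailServer, here is a minimal Python sketch. It assumes Python's `re` module behaves closely enough to the hMailServer engine for these two patterns, which this thread does not confirm. The names `BRAND`, `ACTION` and `matches_both` are made up for illustration.

```python
import re

# The two condition patterns from the post above, unchanged.
# BRAND, ACTION and matches_both are hypothetical names, not hMailServer terms.
BRAND = re.compile(r"(?i:^.*(itune|apple|paypal).*$)")
ACTION = re.compile(r"(?i:^.*(suspend|restrict).*$)")


def matches_both(subject: str) -> bool:
    """Return True only when both conditions match, like a rule set to "use AND"."""
    return bool(BRAND.search(subject)) and bool(ACTION.search(subject))


if __name__ == "__main__":
    for text in (
        "itunes account will be suspended",
        "will restrict your paypal account",
        "itunes will terminate your account",
    ):
        print(text, "->", matches_both(text))
```

Word order does not matter here because each pattern is tested on its own and the AND is done outside the regex.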

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-25 19:41

Hi Percepts

Yes, I will have to do that for now. But it's not ideal, as I want to run the same test against both the SUBJECT and the BODY. If I knew the proper regex way to achieve the 'both' test in one expression, then my rule could be:
where BODY = [regex] OR where SUBJECT = [regex]. As the test now has to be where FIELD = REGEX1 AND REGEX2, I can't do the OR across the separate fields (therefore needing 2 rules instead of one).

So if the regex way does come to light... it would be appreciated.

Cheers

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-25 20:04

It won't come to light unless you do it yourself. There is no AND operator in regex, and you are trying to use it like a programming language. It isn't; it's a pattern-matching tool, and getting all that logic into it would make it very long-winded.

Write yourself some VBScript if you need that kind of logic.

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-25 20:43

Ah. There is no 'AND' operator. Well, that is that then. So 2 rules it is.

Cheers.

SorenR
Senior user
Posts: 3217
Joined: 2006-08-21 15:38
Location: Denmark

Re: Another Regular Expression Question

Post by SorenR » 2014-06-25 20:53

jimimaseye wrote:Ok, I think Im on my way:

(.*([Aa]pple|[Ii][Tt]unes|[Pp]aypal).*)(.*(suspend|restrict).*) but still not quite right as it only works if the words are in order as listed.

ie,
Paypal will be restricted - WORKS
will restrict your Paypal - doesnt work as in wrong order.

Cant work out how to apply if ANY of the subsection words apply in ANY order.

Can anyone help?
/(.*(Apple|iTunes|Paypal).*)(.*(suspend|restrict).*)/ig

i = case insensitive
g = global search

Will match "itunes account will be suspended"

Playing with the online regex builder here -> http://regexr.com/ ;-)
SørenR.

“With age comes wisdom, but sometimes age comes alone.”
- Oscar Wilde

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-25 21:22

UPDATE:

The way to get it into one rule:

subj ...regex... (paypal|itunes|apple).*(suspend|terminate)
OR
body ...regex... (paypal|itunes|apple).*(suspend|terminate)
OR
subj ...regex... (suspend|terminate).*(paypal|itunes|apple)
OR
body ...regex... (suspend|terminate).*(paypal|itunes|apple)

That should do it. That way it's testing for either order of the words, in either of the fields.

In case anyone is interested in my formula (to take on this endless PayPal/Apple ID/iTunes phishing spam malarkey):

(?i:^.*(itune|apple|paypal|account).*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit).*$)
and the reverse:
(?i:^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit).*(itune|apple|paypal|account).*$)

Obviously this will be subject to modification of the terms as new spam terminology gets sent out, and it may not suit everyone's situation (I am confident it suits my account).
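
Since the forward and reverse patterns must always carry the same word lists, one way to keep them in sync is to generate both orders from a single pair of lists. The following is a rough Python sketch of that idea, not something hMailServer does itself. `either_order`, `BRANDS` and `ACTIONS` are hypothetical names, and the short lists stand in for the longer ones above.

```python
import re

# Stand-in word lists taken from the original question; extend as needed.
BRANDS = ["paypal", "itunes", "apple"]
ACTIONS = ["suspend", "restrict"]


def either_order(group_a, group_b):
    """Build one case-insensitive pattern matching 'a ... b' or 'b ... a'."""
    a = "|".join(re.escape(word) for word in group_a)
    b = "|".join(re.escape(word) for word in group_b)
    return re.compile(rf"(?:{a}).*(?:{b})|(?:{b}).*(?:{a})", re.IGNORECASE)


PATTERN = either_order(BRANDS, ACTIONS)

if __name__ == "__main__":
    for text in ("Paypal will be restricted", "will restrict your Paypal"):
        print(text, "->", bool(PATTERN.search(text)))
    # The combined expression could be pasted into a rule after testing it there.
    print(PATTERN.pattern)
```

Adding a new term then means editing one list rather than several copies of the expression.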

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-25 21:34

That's what I meant by long-winded. It can be done in one regex, but it requires several ORs and swapping stuff round.
There are more commands which look ahead that effectively act like an AND, but it gets completely unreadable if you have a memory like a sieve (me) and rarely use regex.

I just use SpamAssassin and auto-update my rules once a day. Much easier, and I don't have to keep modifying regex patterns. SA does it for you.
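
The look-ahead commands mentioned above can be sketched in Python as follows. Each `(?=...)` checks a condition from the start of the string without consuming anything, so two of them in a row behave like an AND regardless of word order. Whether the hMailServer regex engine accepts this syntax is not confirmed anywhere in this thread, so treat it as an illustration to test on the target system. `LOOKAHEAD_AND` is a made-up name.

```python
import re

# Two look-aheads anchored at the start of the string: both must succeed.
# Untested in hMailServer; shown here with Python's re module only.
LOOKAHEAD_AND = re.compile(
    r"^(?=.*(?:itune|apple|paypal))(?=.*(?:suspend|restrict)).*$",
    re.IGNORECASE,
)

if __name__ == "__main__":
    for text in (
        "itunes account will be suspended",
        "will restrict your paypal account",
        "itunes will terminate your account",
    ):
        print(text, "->", bool(LOOKAHEAD_AND.search(text)))
```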

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-25 22:03

I also have SpamAssassin with daily updates, but it doesn't make any impact on these kinds of emails at all. They all seem to slip through as new sources and methods. That is why I am creating this rule. (Very occasionally one might get recognised and scored high enough, but maybe only 1 in 7; the rest slip through.)

I didn't think that was too long-winded really; it was just a case of inserting the new word into the list and then copying the line 3 times.

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-25 23:01

so what was all the fuss about... amazing what you can do if you try

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-25 23:10

Well, the first was that initially I couldn't get anything to work as I simply didn't understand how to start the query and secondly I wasn't aware that there was no AND operator. And actually it was your reply that helped me on my way. Now, with your help, I have my answer. Ta.

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-26 10:01

Hi Percepts

I was trying to make sense of the regex expression, so I was matching it against some online 'help' docs. I wondered about the starting ^ and the ending $ symbols. If I remove them and do a test match, the expression still seems to work. So I was wondering if I am misunderstanding something that you know better, and whether I should leave them in as you stated? (I understand the remaining parts of the expression, although I'm not sure of the formatting of the i: or the reliance on ?) Could you explain the ^ and $ please?

Here is my current expression (without the above symbols):

Code:

(?i:.*(itune|apple|paypal|account).*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer).*)

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-26 13:52

In many cases they aren't required, but I always put them in for clarity and to remove uncertainty, especially when you come back and look at your regex 6 months later and have forgotten exactly how regex works or what your intention was at the time you wrote it.
And if you have an OR where you want to go back and look from start of line then ^ will be required and using $ makes it clear that this section of the regex is to end of line.

And bear in mind that there are different implementations of regex and how it works. It depends on the regex implementation that was used for compiling hMailServer and hMailAdmin as opposed to, for example, the implementation in PHP for WebAdmin. And Unix and Perl implementations may be different.
And that means you have to be very careful about where you are reading up on regex usage, because if it's talking about Unix or Perl, what you're reading may not apply to the Windows version of JavaScript, for example, or to what hMailServer uses.

So beware of online JavaScript-based testing tools, especially if you don't know whether the website is running on Windows or Unix and whether the page uses PHP or ASP or some other web scripting language. Always test it on your target platform and software.

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-26 14:12

Ok, well for sure I understand the whole complication of ensuring I am looking at the correct version of regex! I have seen several conflicting opinions/versions of how to do things, and I reckon this definitely added to my overall confusion at the beginning.

Regarding the 2 symbols: ok, well without them it seems to work. But for clarity I will ensure they are included, and then in the future should I ever come to dabble in doing another regex expression and use this as a starting point, they may well be the difference to making my next one work or not.

Cheers.

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-26 14:53

you don't even need the .* at start and end if you remove the ^ and $ so I think you have a way to go yet.

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-26 15:13

I just tried that and it didn't seem to work. (Removing those symbols and the .* at beginning and end resulted in NO match, so I will leave things as is.) It seems the .* is required at both ends.

test:

(?i:(itune|apple|paypal|account).*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer))
against
"your itunes is suspended now"

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-26 15:36

Well there you are: it works in my JavaScript tester but not in hMailServer, which only goes to prove I should listen to my own advice and test in the target OS and software. :lol:

It could be that hmail assumes ^ and $ or it could be the implementation of regex that is being used, I don't know.

?i: doesn't work in my JavaScript tester, but I knew that.
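
The difference seen above is consistent with the first guess, that the engine requires the pattern to cover the whole string rather than just find it somewhere inside. That remains a guess as far as this thread goes. The Python sketch below only illustrates the two behaviours using `re.search` (find anywhere) and `re.fullmatch` (must consume the entire string), with a shortened word list for readability.

```python
import re

SUBJECT = "your itunes is suspended now"

# Shortened versions of the expressions tested above.
BARE = r"(?i:(itune|apple|paypal|account).*(verif|suspend|restrict))"
PADDED = r"(?i:.*(itune|apple|paypal|account).*(verif|suspend|restrict).*)"

# Substring search: the bare pattern is found in the middle of the subject.
found_anywhere = re.search(BARE, SUBJECT) is not None

# Whole-string match: the bare pattern fails because "your " and " now"
# are left over, while the padded pattern absorbs them with the outer .*
whole_string_bare = re.fullmatch(BARE, SUBJECT) is not None
whole_string_padded = re.fullmatch(PADDED, SUBJECT) is not None

if __name__ == "__main__":
    print(found_anywhere, whole_string_bare, whole_string_padded)
```

If the target engine does whole-string matching, the leading and trailing .* do the real work and the ^ and $ are redundant but harmless.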

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-26 20:33

Here you go, Percepts, one of the phishing emails (that I aim to be capturing with my regex formula) has come in and been caught. I post the headers here and you can see how it completely slips through spamassassin (the last header is a custom header added by my rule so I can see why it ends up in the trash - rule or user)

Code:

Return-Path: paypal@support.com
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on decerver
X-Spam-Flag: YES
X-Spam-Level: ***
X-Spam-Status: Yes, score=3.6 required=3.0 tests=HTML_MESSAGE,MIME_HTML_ONLY,
 SHORTENED_URL_SRC,TVD_PH_BODY_ACCOUNTS_PRE,T_RP_MATCHES_RCVD autolearn=no
 version=3.3.2
X-Spam-Report:
 * -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay
 *    domain
 *  0.0 HTML_MESSAGE BODY: HTML included in message
 *  1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
 *  1.0 SHORTENED_URL_SRC RAW: SHORTENED_URL_SRC
 *  1.5 TVD_PH_BODY_ACCOUNTS_PRE The body matches phrases such as "accounts
 *      suspended", "account credited", "account verification"
 *
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <www-data@ne302.plesk.neen.it>
Received: from mailin2.hostvue.com (mailin2.hostvue.com [195.26.90.112]) (authenticated
 user=user@mydomain.co.uk bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
 with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
 verify=YES); Thu, 26 Jun 2014 15:56:27 +0100
X-Sieve: CMU Sieve 2.4
Received: from ne302.plesk.neen.it ([85.159.145.120]) by mailin2.hostvue.com with
 esmtps (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from <www-data@ne302.plesk.neen.it>)
 id 1X0B6B-00081U-9x for user@mydomain.co.uk; Thu, 26 Jun 2014 15:56:27
 +0100
Received: by ne302.plesk.neen.it (Postfix, from userid 33) id 794D825824D; Thu, 26 Jun
 2014 16:56:26 +0200 (CEST)
To: user@mydomain.co.uk
Subject: [SPAM] [3.6] Your account has been suspended !
X-PHP-Originating-Script: 33:1.php
From: PayPal <paypal@support.com>
Reply-To: paypal@support.com
MIME-Version: 1.0
Content-Type: text/html
Message-Id: <20140626145626.794D825824D@ne302.plesk.neen.it>
Date: Thu, 26 Jun 2014 16:56:26 +0200 (CEST)
Content-Transfer-Encoding: quoted-printable
X-Spam-Prev-Subject: Your account has been suspended !
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-Score: 5
hMailRule_DeletePhishSpam: Yes
And here is another that has just come in:

Code:

Return-Path: paypal-security@paypal.com
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on decerver
X-Spam-Flag: YES
X-Spam-Level: ******
X-Spam-Status: Yes, score=6.3 required=3.0 tests=DKIM_ADSP_DISCARD, HTML_IMAGE_ONLY_04,HTML_MESSAGE,HTML_SHORT_LINK_IMG_1,MIME_HTML_ONLY,
 RCVD_VIA_APNIC,URI_NOVOWEL,XM_PHPMAILER_FORGED autolearn=no version=3.3.2
X-Spam-Report:
 *  1.8 DKIM_ADSP_DISCARD No valid author signature, domain signs all mail
 * and suggests discarding the rest
 *  0.5 URI_NOVOWEL URI: URI hostname has long non-vowel sequence
 *  0.0 HTML_MESSAGE BODY: HTML included in message
 *  1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
 *  0.3 HTML_IMAGE_ONLY_04 BODY: HTML: images with 0-400 bytes of words
 *  0.1 HTML_SHORT_LINK_IMG_1 HTML is very short with a linked image
 *  2.4 XM_PHPMAILER_FORGED Apparently forged header
 *  0.0 RCVD_VIA_APNIC Received through Asia or Pacific islands (Oceana)
 *
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <paypal-security@paypal.com>
Received: from mailin4.hostvue.com (mailin4.hostvue.com [195.26.90.114]) (authenticated
 user=danny@mydomain.co.uk bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
 with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
 verify=YES); Thu, 26 Jun 2014 19:31:19 +0100
X-Sieve: CMU Sieve 2.4
Received: from sg2nlvphout02.shr.prod.sin2.secureserver.net ([182.50.132.196]) by
 mailin4.hostvue.com with esmtp (Exim 4.80.1) (envelope-from <paypal-security@paypal.com>)
 id 1X0ES6-00022y-Fd for danny@mydomain.co.uk; Thu, 26 Jun 2014 19:31:19
 +0100
Received: from ip-118-139-182-12.ip.secureserver.net ([118.139.182.12]) by
 sg2nlvphout02.shr.prod.sin2.secureserver.net with : DED : id
 K6XE1o02P0GTTAm016XEaU; Thu, 26 Jun 2014 11:31:15 -0700
x-originating-ip: 118.139.182.12
Received: (qmail 12271 invoked by uid 10000); 26 Jun 2014 23:31:14 +0500
Date: Thu, 26 Jun 2014 23:31:14 +0500
To: danny@mydomain.co.uk
From: =?UTF-8?Q??= <paypal-security@paypal.com>
Reply-To: noreply@paypal.com
Subject: [SPAM] [6.3] Please Verify Your Paypal Account
Message-ID: <2df22738358f2114c64f8c82fdad2508@www.apnaplot.pk>
X-Priority: 3
X-Mailer: PHPMailer (phpmailer.sourceforge.net) [version ]
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/html; charset="iso-8859-1"
X-Spam-Prev-Subject: =?UTF-8?Q?Please_Verify_Your_Paypal_Account?=
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-Score: 5
hMailRule_DeletePhishSpam: Yes
Note that this 2nd one doesn't have the TVD_PH_BODY_ACCOUNTS_PRE rule catching it, or anything similar.

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-26 21:36

I must be missing something, both of those mails are flagged as spam by spamassassin. Your required score is set to 3.
They both have: X-Spam-Flag: Yes

so how have they slipped through ?

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-26 21:47

They have been flagged as spam, but not for the definitive reason of being a phishing email, and I set the score of 3 by choice; if I had set it higher, say 4, then the first one wouldn't have been caught. Note that if the 1st email had adopted the BODY image type of the 2nd email (i.e. just an image), then it wouldn't have been scored 1.5 for TVD_PH_BODY_ACCOUNTS_PRE and the overall score would have been 2.1... and NOT flagged as spam.

The reason I set it to flag as a WARNING spam rather than delete is because we still get GENUINE emails coming in from our suppliers that get scored between 3 and around 5.8, and we can't risk such genuine mails being deleted. I have a score of 8 set as the delete threshold. My regex formula, though, definitely catches those specific emails without leaving anything to chance. If there were a SpamAssassin formula that effectively replicated my regex formula (narrowing down the risk), then it would be more reliable and I could increase its score on a match.

Of course you might say "well why don't you write one then, Jim?": the answer is simply this (pick one):
a, if I have to write a formula myself that's local to my system, then isn't that what I have already just done in hMailServer?
b, I have NO IDEA how to write such things and add them to SpamAssassin (and I'm sure you are already fed up with me picking your brains ;-) )
Last edited by jimimaseye on 2014-06-26 21:54, edited 1 time in total.

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-26 21:51

check out the SA-learn.exe program

http://spamassassin.apache.org/full/3.3 ... learn.html

Although if it's just this one phishing source that is causing a problem, then it's probably not worth the hassle.

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-26 22:02

Well, I was going to say that it looks too technical for my lame brain (I'm not good at starting from instructions and better at picking up and learning from others). But then I glanced down the page and saw this as part of the Getting Started section:

"Build a significant sample of both ham and spam.
I suggest several thousand of each, placed in SPAM and HAM directories or mailboxes.
"

And with that, I've got no chance! I don't have thousands, or even hundreds (barely 10s!) of spam. Our main spam nowadays is this phishing email that comes in to one specific account, probably averaging 2 or 3 a day (the problem is the user is gullible enough to click one of the links, which is probably why only he gets these emails anyway). Almost all other spam is EXTREMELY rare and is rightfully handled by the default SpamAssassin rules (viagra, russian babes etc).

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-26 22:08

BTW: just for completeness, here is one of those GENUINE emails that we receive (there is nothing dodgy about it) that scores 5.8. Note it's not too dissimilar to the dodgy ones above.

Code:

Return-Path: bet-planning@oursupplier.nl
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on decerver
X-Spam-Flag: YES
X-Spam-Level: *****
X-Spam-Status: Yes, score=5.8 required=3.0 tests=DEAR_SOMETHING, HTML_IMAGE_ONLY_16,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,KHOP_SC_CIDR8,
 MIME_HTML_ONLY,RDNS_NONE,T_REMOTE_IMAGE autolearn=no version=3.3.2
X-Spam-Report: 
 *  1.7 DEAR_SOMETHING BODY: Contains 'Dear (something)'
 *  0.0 HTML_MESSAGE BODY: HTML included in message
 *  1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
 *  1.0 HTML_IMAGE_ONLY_16 BODY: HTML: images with 1200-1600 bytes of words
 *  0.6 HTML_MIME_NO_HTML_TAG HTML-only message, but there is no HTML tag
 *  1.3 RDNS_NONE Delivered to internal network by a host with no rDNS
 *  0.0 T_REMOTE_IMAGE Message contains an external image
 *  0.0 KHOP_SC_CIDR8 Relay CIDR /8 is among worst in SpamCop
 *
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <bet-planning@oursupplier.nl>
Received: from mailin5.hostvue.com (mailin5.hostvue.com [195.26.88.123]) (authenticated
 user=sales@mydomain.co.uk bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
 with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
 verify=YES); Thu, 26 Jun 2014 20:10:10 +0100
X-Sieve: CMU Sieve 2.4
Received: from [188.203.209.3] (helo=mail.coso.nl) by mailin5.hostvue.com with esmtps (TLSv1:RC4-MD5:128)
 (Exim 4.80.1) (envelope-from <bet-planning@oursupplier.nl>) id
 1X0F3h-00031Q-0z for SALES@mydomain.co.UK; Thu, 26 Jun 2014 20:10:09
 +0100
X-MDAV-Processed: mail.coso.nl, Thu, 26 Jun 2014 21:13:21 +0200
Received: from tw-DataProvider ([195.225.34.232]) by coso.nl (mail.coso.nl) (MDaemon
 PRO v11.0.3) with ESMTP id md50000102217.msg for <SALES@mydomain.co.UK>;
 Thu, 26 Jun 2014 21:13:20 +0200
X-MDRemoteIP: 195.225.34.232
X-Return-Path: bet-planning@oursupplier.nl
X-Envelope-From: bet-planning@oursupplier.nl
X-MDaemon-Deliver-To: SALES@mydomain.co.UK
MIME-Version: 1.0
From: bet-planning@oursupplier.nl
To: SALES@mydomain.co.UK
Reply-To: planning@bet.be
Date: 26 Jun 2014 21:22:40 +0200
Subject: [SPAM] [5.8] Notification of new and/or changed deliveries
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Message-ID: <cmu-lmtpd-28197-1403809810-0@ms4.hostvue.com>
X-Spam-Prev-Subject: Notification of new and/or changed deliveries
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-2: The host name specified in HELO does not match IP address. - (Score: 2)
X-hMailServer-Reason-Score: 7

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-26 23:41

I presume the body text is different in each one.

Anyhow, you have a solution which is working for you now so problem solved.

percepts
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-27 00:55

OK, I've been wondering how to create SA rules, so I've had a little play and this is what I've come up with, which seems to work fine.

Note we are now on the Perl regex implementation (same as Unix, I think).
And you can delete the describe line if you don't want it.

Code:

header __MY_PHISH_SUBJECT_1  Subject =~  /^.*(itune|apple|paypal|account).*$/i
header __MY_PHISH_SUBJECT_2  Subject =~  /^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit).*$/i
meta MY_PHISH_SUBJECT  (__MY_PHISH_SUBJECT_1 && __MY_PHISH_SUBJECT_2)
score  MY_PHISH_SUBJECT 7.0
describe MY_PHISH_SUBJECT		My Phishing Rule
Put the above at the end of your pathtoSA/etc/spamassassin/local.cf file.

Restart the spamd service and try it.

One word of warning: if your other SA scores accumulate to less than zero, it is possible it will still get through, but unlikely. Note that you can add body tests to the above as well. See references: http://wiki.apache.org/spamassassin/WritingRules

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-30 11:15

Thanks for that percepts. It seems we are both on a learning curve right now then. :-)

For your info, the 'body' type check actually already includes the SUBJECT line (as I found out in my testing and then confirmed on that RULES wiki page). So I only need the one rule (applied to 'body') to catch either position. Here is the new rule:

Code:

body __MY_PHISH_BODY_1  /^.*(itune|apple|paypal|account).*$/i
body __MY_PHISH_BODY_2  /^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer).*$/i
meta MY_PHISH_BODY  (__MY_PHISH_BODY_1 && __MY_PHISH_BODY_2)
score  MY_PHISH_BODY 8.0
describe MY_PHISH_BODY      Apple/itunes/Paypal Body Phishing Rule

jimimaseye
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-30 15:41

By the way (for readers generally), just to give you the complete picture: all my emails at the moment come in via External Download, so SpamAssassin only scores but doesn't delete or prevent delivery. The automated deletion of high-scoring emails is done with a global rule in hMailServer that looks at the X-Spam-Level header and checks whether it has at least 7 asterisks (*******). This way it deletes all spam scored 7 or higher (in fact my 'delete' only moves it directly to TRASH, giving the ability for a last-resort double check if needed). Therefore I can set the score of my custom rule to anything I want (I could set it to 100 if I wanted).

I do have the hMailServer spam settings doing proper (unrecoverable) deletions if a HMAILSERVER score reaches 8 or higher, and marking as SPAM on reaching a score of 5. 5 is also the score given when SpamAssassin marks an email, with 3, 2 and 2 being awarded for the SPF, HELO and DNS-MX record checks respectively. I chose this method rather than simply using the SpamAssassin score because SpamAssassin scores are sometimes not so consistent (the same email checked twice can come back with different scores for some reason).

So to summarise, my spam check settings in hMailServer are:
  • Spam Delete Threshold = 8
    SPF = 3
    HELO = 2
    DNS-MX = 2

    Use SpamAssassin = Yes
    Use SA score = No
    Score = 5
and I have SpamAssassin marking mail as spam with a threshold score of 3.

Then a global rule that checks the X-Spam-Level header for ******* (7 asterisks) and moves the message to 'Trash'.

So:
  • If SA marks the message as spam, hMailServer scores it a 5.
  • If hMailServer adds its own checks to that score and the total reaches 8, the message gets deleted; otherwise it is just marked '[SPAM]'.
  • Otherwise, if the hMailServer global rule sees that SA has scored it 7 or higher, it gets Trashed (moved to Trash, as will be the case with the custom rule discussed earlier, currently scored at 8).
  • Otherwise the message stays in the Inbox (with '[SPAM]' added to the start of the subject).
I also have SpamAssassin tagging the subject with the SpamAssassin score, so any emails marked as spam but not deleted will look like:
"[SPAM] [3.5] Here is your invoice/original subject..."

Doing this safeguards against rogue SpamAssassin rule scores that might cause a direct delete (if I had used "Use SA score = Yes") without my being able to double check. But of course, if SA has scored it AND hMailServer scores it with its own checks as well (reaching 8), then there is less doubt and it is therefore safer to wipe out without fear.
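
For anyone who finds the flow easier to follow as code, here is a minimal Python sketch of the decision order described above. It is only an illustration: the function name `classify`, its parameters and the returned labels are made up for this example and are not hMailServer or SpamAssassin settings, and the order of the checks follows the description in this post rather than anything confirmed about hMailServer's internals.

```python
# Hypothetical sketch of the scoring flow described in the post.
# Thresholds come from the post; function and constant names are illustrative only.

SA_SPAM_THRESHOLD = 3.0      # SpamAssassin marks mail as spam at this score
HMAIL_SA_POINTS = 5          # hMailServer points added when SA marks the mail
HMAIL_SPF_POINTS = 3
HMAIL_HELO_POINTS = 2
HMAIL_MX_POINTS = 2
HMAIL_MARK_THRESHOLD = 5     # subject gets "[SPAM]"
HMAIL_DELETE_THRESHOLD = 8   # unrecoverable delete
TRASH_ASTERISKS = 7          # global rule: X-Spam-Level has at least 7 asterisks


def classify(sa_score, spf_fail=False, helo_fail=False, mx_fail=False):
    """Return what happens to a message under the settings from the post."""
    hmail_score = 0
    if sa_score >= SA_SPAM_THRESHOLD:
        hmail_score += HMAIL_SA_POINTS
    if spf_fail:
        hmail_score += HMAIL_SPF_POINTS
    if helo_fail:
        hmail_score += HMAIL_HELO_POINTS
    if mx_fail:
        hmail_score += HMAIL_MX_POINTS

    # 1. hMailServer's own total reaches the delete threshold.
    if hmail_score >= HMAIL_DELETE_THRESHOLD:
        return "deleted"

    # 2. Global rule: one asterisk per whole point of the SA score.
    spam_level = "*" * int(sa_score)
    if "*" * TRASH_ASTERISKS in spam_level:
        return "trash"

    # 3. Marked as spam but kept in the Inbox, or not spam at all.
    if hmail_score >= HMAIL_MARK_THRESHOLD:
        return "inbox [SPAM]"
    return "inbox"


print(classify(3.5))                 # inbox [SPAM]
print(classify(8.0))                 # trash
print(classify(3.5, spf_fail=True))  # deleted
```

The third call shows the "less doubt" case: SpamAssassin and one of hMailServer's own checks both fire, so the total reaches 8.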

It may sound complicated, but it's not really. And it works really quite well. (Out of 2350 emails, only 35 genuine ones were retained falsely marked as [SPAM] (above a SpamAssassin score of 3): 19 of them are from 1 supplier who always scores 5.8 for some reason, and the other 16 scored 3.x.)

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-06-30 16:10

Seems awfully complicated, but then it's very rare that I actually get any spam.

You can add an SA rule for that supplier's domain/user and, if mail is from them, give it a negative score of, say, -5, so they won't get marked as spam unless something changes.

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-06-30 16:24

Indeed, but it's not really. All I have done above is document the settings I have, as offered by hMailServer and SpamAssassin. I guess the explanation of WHY I have these settings may cause confusion or seem complicated, but I have simply set the 6 settings accordingly (finely balanced), plus a rule, to capture all 3 eventualities (not spam, potentially spam, or 'defo don't want to see this muck' spam).

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-07-01 14:11

One little tip: add the following to your local.cf

#X-Spam-Score: 9999.99
add_header spam Score _SCORE_

It increases your hMailServer rule/function decision-making options in an easy-to-use way.

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-07-01 15:25

I'm not following you, I don't understand. I can add it as quoted, yes, but could you explain the purpose a bit better please?

Ta.

(I do already have this; is it the same thing?

 rewrite_header Subject [_HITS_]

It gives me the SA score in [brackets] on the subject. And I also already have the X-Spam-Level header in my mail headers giving the score in asterisks, i.e. **** for a score of 4.)

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-07-01 15:47

Yes, HITS is the same value but is deprecated in favour of SCORE (HITS still works though).

Just put it in and see.

You get a header with just the SA score in it (including the decimal portion).

It means you can easily calculate what portion of the hMailServer score comes from hMailServer and what portion from SA, without resorting to string searches of the subject or X-Spam-Status to find it.

I.e. you have the X-Spam-Score value and the X-hMailServer-Reason-Score value, which can be used to determine what portion came from hMailServer without doing string searches.

Just thought it may be useful, that's all, especially if you want to use the fractional portion of the returned SA score in any tests when fine-tuning what's spam and what's not.
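
As a rough illustration of that idea, the Python sketch below reads the two headers from a message and splits the hMailServer total into the part attributed to SpamAssassin and the rest. The helper name `score_breakdown` and the sample message are made up for this example, and the fixed 5 points and threshold of 3 are assumptions taken from the settings posted earlier in the thread ("Use SA score = No", "Score = 5"); with other settings the arithmetic would differ.

```python
from email import message_from_string

# Assumptions from the settings quoted earlier in the thread, not hMailServer defaults.
SA_FIXED_POINTS = 5       # points hMailServer adds when SpamAssassin flags the mail
SA_SPAM_THRESHOLD = 3.0   # score at which SpamAssassin flags the mail


def score_breakdown(raw_message):
    """Split the hMailServer total into an SA part and hMailServer's own part."""
    msg = message_from_string(raw_message)
    sa_score = float(msg["X-Spam-Score"])
    hmail_total = int(msg["X-hMailServer-Reason-Score"])
    sa_part = SA_FIXED_POINTS if sa_score >= SA_SPAM_THRESHOLD else 0
    return {
        "sa_score": sa_score,
        "hmail_total": hmail_total,
        "sa_part": sa_part,
        "hmail_own": hmail_total - sa_part,
    }


# Made-up sample headers for the illustration.
sample = (
    "X-Spam-Score: 9.7\n"
    "X-hMailServer-Reason-Score: 5\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(score_breakdown(sample))
```

The sketch assumes both headers are present; a message without them would need a check before the `float()` and `int()` calls.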

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-07-01 15:50

Ok, understood. Cheers for that.

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-07-09 17:27

Hi percepts

I've been trying to figure this out but can't. Maybe you have the knowledge...

The search line:
/^.*(verif|update|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|limit|dear user|dear customer).*$/i

Following on from the search string, I want it to continue to look for the word CONFIRM, but I don't want it to match CONFIRMATION.

I don't know how, though, and can't find it. (Something like "...|froze|confirm[!ation]|rectif..."??)

Any ideas?

Cheers chap.

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-07-09 17:37

put a space after confirm

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-07-09 17:39

But that would stop the words "confirms", "confirmed" etc. from being detected.

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-07-09 17:41

but you didn't provide that as a specification so you'll get an answer that answers what you ask.

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-07-09 17:43

... |confirm[^a]| ...

Do you think this is correct? (going from http://www.zytrax.com/tech/web/regex.htm#brackets)

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-07-09 17:46

Try the following:

confirm[^a]

See the following for a quick reference, but be careful: ^ only works as NOT inside a character class [ ].

http://regexlib.com/CheatSheet.aspx

p.s. I think our mails crossed. You got there already.
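
To make the difference concrete, here is a small Python sketch comparing `confirm[^a]` with a negative lookahead, `confirm(?!ation)`. SpamAssassin rules use Perl regular expressions; Python's `re` module is used here only to make the example runnable, on the assumption that these two constructs behave the same way in both. The sample phrases are made up for the illustration.

```python
import re

samples = [
    "confirm your account",
    "confirmed",
    "confirms",
    "confirmation",
    "please confirm",
]

# [^a] must consume one character that is not "a" after "confirm".
char_class = re.compile(r"confirm[^a]", re.IGNORECASE)
# (?!ation) only checks that "ation" does not follow; it consumes nothing.
lookahead = re.compile(r"confirm(?!ation)", re.IGNORECASE)

for text in samples:
    print(
        f"{text!r}: "
        f"[^a]={bool(char_class.search(text))} "
        f"(?!ation)={bool(lookahead.search(text))}"
    )
```

Both patterns reject "confirmation" and accept "confirmed" and "confirms". The difference shows on "please confirm": the character class needs something after the word, so it misses "confirm" at the very end of the text, while the lookahead still matches.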

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-07-29 22:25

OK, it's been some weeks now and I have been tweaking and tailoring the regex to maximise its effect and minimise false positives.

For readers who may be interested, the following is the current (and so far the best-performing) version of my custom SpamAssassin rule (entered in my 'local.cf' file) that catches all those annoying PayPal/Apple/iTunes "account has been frozen" style phishing emails.

(Note: my rule scores it as 7.0, as this satisfies my personal spam-catching setup, which places everything scoring 7 or above directly into the Trash folder of the account, but users may wish to change the score to match their own scoring system. Oh, and of course you can change the description to read whatever you want (I left it verbatim for manual checking of emails). Also, with this regex setup I have only had to whitelist 1 genuine sender out of all our senders to ensure it doesn't get accidentally trapped: a hotel booking site, whose booking confirmations quote the words 'account' and 'confirm'.)

Here is the SpamAssassin rule:

#  Part 1 - look for inclusion of "itune/paypal" etc ("account" too dangerous for FROM)
header __MY_PHISH_HEAD_FROM From =~ /^.*([il]tune|app[li]e|paypal|amazon).*/i
body   __MY_PHISH_BODY_1            /^.*([il]tune|app[li]e|paypal|amazon|account).*$/i

# Body Part 2 - look for key words/phrases as well (*CONFIRM* to exclude 'confirmAtion')
body __MY_PHISH_BODY_2  /^.*(verif|up ?date (your|the) (paypal|[il]tunes?|apple*|account)|up ?date your info|udapte|froze|confirm[^a]|rectif|expir|informations|suspend|restrict|(be|been|with|remove) limm?it|dear user|dear c[uo]st|dear ?,).*$/i

meta MY_PHISH_BODY  ((__MY_PHISH_HEAD_FROM || __MY_PHISH_BODY_1) && __MY_PHISH_BODY_2)
score  MY_PHISH_BODY 7.0
describe MY_PHISH_BODY      Custom Apple/itunes/Paypal/Account Phishing (verif|update your/the [pay/itun/App/acc]|update your info|udapte|froze|confirm|rectif|expir|informations|suspend|restrict|been/be/with/remove limit|dear user|dear c[ou]st|dear,)
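
For readers who want to try phrases against these patterns without touching a live SpamAssassin setup, here is a rough Python port of the rule's logic. It is a sketch under assumptions: the helper name `my_phish_body` is made up, Python's `re` stands in for Perl's regex engine, and the function checks a single line of text, whereas SpamAssassin (as far as I understand it) evaluates each sub-rule over the whole message, so the two parts may hit in different paragraphs.

```python
import re

FLAGS = re.IGNORECASE

# Patterns copied from the rule above; only the /.../i delimiters are dropped.
HEAD_FROM = re.compile(r"^.*([il]tune|app[li]e|paypal|amazon).*", FLAGS)
BODY_1 = re.compile(r"^.*([il]tune|app[li]e|paypal|amazon|account).*$", FLAGS)
BODY_2 = re.compile(
    r"^.*(verif|up ?date (your|the) (paypal|[il]tunes?|apple*|account)"
    r"|up ?date your info|udapte|froze|confirm[^a]|rectif|expir|informations"
    r"|suspend|restrict|(be|been|with|remove) limm?it|dear user|dear c[uo]st"
    r"|dear ?,).*$",
    FLAGS,
)


def my_phish_body(from_header, body_line):
    """Mirror of (__MY_PHISH_HEAD_FROM || __MY_PHISH_BODY_1) && __MY_PHISH_BODY_2."""
    part1 = bool(HEAD_FROM.search(from_header)) or bool(BODY_1.search(body_line))
    part2 = bool(BODY_2.search(body_line))
    return part1 and part2


# The three examples from the first post of the thread.
print(my_phish_body("x@example.com", "itunes account will be suspended"))    # True
print(my_phish_body("x@example.com", "will restrict your paypal account"))   # True
print(my_phish_body("x@example.com", "itunes will terminate your account"))  # False
```

The structure is the answer to the original question: one alternation per group of words, each in its own sub-rule, combined with AND in the `meta` line.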

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-07-29 23:15

Well done :D

You are now officially the hmail regex and spamassassin rule expert :mrgreen:

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-07-29 23:27

Now now Percepts, don't be so bashful. :D

For the record, 90% of the credit for this goes to you for your help. I merely tweaked the phrases being searched and chose the method of checking the emails.

I still stand bemused by the seemingly random 'flavours' of regex out there, and by trying to find the correct one that SpamAssassin uses, never mind actually understanding the common 'language' of it. I still have a couple of pages open to refer to and often find my experience not matching what the help pages suggest. Sometimes I have to refer to existing rules in SpamAssassin to see if I can make any sense of them and then copy the style.

And all this, over and over again for the last 3 weeks... just for the quoted rule. THAT is how expert I am!!

Thanks again.

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-08-05 01:15

Hi Percepts (or others)

One of the tricks the spammers are using is to use unusual character sets that have a character that LOOKS like a normal ANSI letter.

For example, it will look like "Dear iΤunes Customer," but the actual HTML code is "Dear i&#932;unes Customer".

See the screenshot for exactly how it looks.
encode.png
I've really been trying to work out how to write a regex for this.

At the moment I have (for the standard word):

^.*itunes.*$/i

but how do I change this for the one that has the '&#932;' character? I simply can't work it out.

E.g. ^.*i\x{932}unes.*$/i or something (btw this doesn't work, it's an example).

Any ideas?

Cheers.

P.S. We are talking about regular expressions for SpamAssassin rules.

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-08-05 02:23

you look for

;unes

and bin it if you find it on the basis that anyone using html character values for a capital T is probably a spammer.

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-08-05 02:59

percepts wrote:you look for

;unes

and bin it if you find it on the basis that anyone using html character values for a capital T is probably a spammer.
to expand it a little you look for

i(&#932;unes|tunes)

at the relevant point in your regex

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-08-05 10:24

Hi Percepts

Tried that, it doesn't work.

I created this test rule:

body MY_TEST /^.*i&#932;unes.*$/i
score MY_TEST 5.0

but this rule does not get evaluated.

I have listed a cut-down version of the HTML-based email with the hooky word in it, so you can save it and open it to see the code and the evaluated spam headers yourself (all personally identifiable info changed). Just copy and paste the code into Notepad and save it as .EML.

(You could also run a test on it yourself if you feel that way inclined. ;-) )

Return-Path: ituens@appel.com
X-hMailServer-ExternalAccount: POPdaily
Return-Path: <arh24309@server83.creativ-hosting.de>
Received: from mail.hosting.com (mail.hosting.com [165.26.90.116]) (authenticated
 user=user1@gmail.com bits=0) by ms4.hostvue.com (Cyrus v2.4.12-Kolab-2.4.12-1.el6)
 with LMTPSA (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256/256
 verify=YES); Mon, 04 Aug 2014 21:08:16 +0100
X-Sieve: CMU Sieve 2.4
Received: from server83.creativ-hosting.de ([91.250.83.61]) by mail.hosting.com
 with esmtps (UNKNOWN:AES256-GCM-SHA384:256) (Exim 4.72) (envelope-from <arh24309@server83.creativ-hosting.de>)
 id 1XEOYJ-0003ub-D7 for user1@gmail.com; Mon, 04 Aug 2014 21:08:16
 +0100
Received: by server83.creativ-hosting.de (Postfix, from userid 10017) id DB2CD61D69;
 Mon,  4 Aug 2014 22:07:47 +0200 (CEST)
Date: Mon, 4 Aug 2014 22:07:47 +0200
To: user1@gmail.com
From: =?UTF-8?Q?Apple?= <ituens@appel.com>
Subject: [SPAM] [9.7] (iTunes) Account disabled
Message-ID: <14df21d87d3d74f9216a8363df3ccacb@www.draht-rogel.de>
X-Priority: 3
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset="us-ascii"
X-Spam-Prev-Subject: =?UTF-8?Q?=28iTunes=29_Account_disabled?=
X-hMailServer-Spam: YES
X-hMailServer-Reason-1: Tagged as Spam by SpamAssassin - (Score: 5)
X-hMailServer-Reason-Score: 5

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>
<head>
</head>
<body>
<span

 style=3D"line-height: 17.04px; font-weight: bold;"><br

 style=3D"line-height: 17.04px;">

            <br style=3D"line-height: 17.04px;">

Dear i&#932;unes Customer,</span><br
</table>
</body>
</html>


percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-08-05 15:09

You could also run a test on it yourself if you feel that way inclined
I don't.

You develop and test your own solutions. You've had more than enough help/pointers on this. If you aren't capable of doing it yourself then find someone local to you and PAY them to do it for you.

mattg
Moderator
Moderator
Posts: 20244
Joined: 2007-06-14 05:12
Location: 'The Outback' Australia

Re: Another Regular Expression Question

Post by mattg » 2014-08-05 17:19

Turtleneck wrote: This percepts guy sounds like he shouldnt be here.
I think you missed a lot of the conversation, and you are jumping to rash conclusions.

Percepts is VERY helpful with the technical stuff, and has helped heaps.

We are all volunteers here.

Be nice and play nice, or don't bother coming back.
Who shouldn't be here are trolls that sign up just to be rude.
Just 'cause I link to a page and say little else doesn't mean I am not being nice.
https://www.hmailserver.com/documentation

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-08-05 18:01

It has obviously escaped you that this is an hMailServer support forum where we help, for the most part, people to get their hMailServer installation up and running when they are experiencing problems with it, or to resolve security loopholes etc. Users are expected to have a good deal of technical savvy if they are taking on being a mail server administrator.

What it is NOT is a programming tutorial site. It is not a SpamAssassin tutorial site. That is someone else's software. We help a little with getting SA installed, but once the person wants constant help with tailoring someone else's software I draw the line. Go to the support forum for that software, not here. And this is not a regex/Perl tutorial or support forum either.

So why are you here? No, don't bother to answer that. People here are busy and don't have time to waste on people who make posts which aren't hMailServer support questions.

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-08-05 18:08

Hello

Well, I am shocked. I have to agree that Percepts has totally taken this somewhere I didn't intend it to go. The winking eye at the end should have given a clue that I was being very tongue-in-cheek about 'running a test yourself'. But that said, it's not unusual for people to do that themselves anyway without anyone asking them. The reason I gave the code was so that ANYONE, not only Percepts, could see the code I was dealing with and see why the suggestion was failing. And as Turtleneck says, of course I wouldn't have been asking if I was able to do it. And if all forum responses were "test it yourself or go pay someone" then there would be no point in having this forum, would there?

Turtleneck: Percepts is, and has been in the past, very helpful. I do see why you said what you said, but to avoid doubt, he can be quite helpful. I would recommend using this forum anyway for your questions and queries, as I have had a lot of help in the past, and there are many other users (including Matt the moderator) who normally offer help one way or another (and don't ask to be paid for it ;-) ).

So, my query still stands: does anyone know how to write a regex that is able to catch that sort of character set/encoding as detailed in my post above? (I don't even know what the terminology is for such a thing. Is it some encoding, or an HTML character set? I don't know.) I would be grateful for anyone's help.

P.S. I would p1ss myself now if Turtleneck provides the answer. :-)

EDIT: Just read your post, Percepts. You didn't say that when the initial thread started, entitled "Another Regular Expression Question", did you! Plus, here is a 'new trick' for you to learn: find more polite ways of telling people you are no longer willing to help. Please.

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-08-05 20:58

It's all Greek to me.

^DooM^
Site Admin
Posts: 13861
Joined: 2005-07-29 16:18
Location: UK

Re: Another Regular Expression Question

Post by ^DooM^ » 2014-08-05 23:16

Lighten up guys. If you can't help or don't want to help, don't reply, simple as that.

Perhaps this site can help with your regex question http://www.regexe.com/
If at first you don't succeed, bomb disposal probably isn't for you! ヅ

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-08-05 23:28

Well said.

I have my answer courtesy of one of our fellow forum users. Valuable help and much appreciated too.

The answer, for future readers and reference, was:

rawbody     /i&\#932;unes./i

There are 2 key points:

1. It's a RAWBODY check (as opposed to a BODY check) that is needed.
2. The opening ^.* and closing .*$ that I was originally using (from Percepts' original suggestion early on in the thread) were causing the problem. Only when they were removed did this get picked up.

Note: of course, don't forget the \ for escaping the hash (otherwise it is classed as the start of a comment).
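
A possible explanation for both points, sketched in Python purely for illustration. SpamAssassin uses Perl regular expressions, and the assumption here (from my reading of the SpamAssassin documentation, not something tested in this thread) is that `body` rules see the rendered text with HTML entities decoded, while `rawbody` rules see the HTML source with its line breaks still present. The variable names and the cut-down HTML string are made up for the example.

```python
import html
import re

# A few lines of HTML source, roughly as in the sample email above.
raw_html = "<br>\nDear i&#932;unes Customer,</span><br\n</body>"
# The same sentence after entity decoding: &#932; becomes U+03A4 (Greek capital tau).
rendered = html.unescape("Dear i&#932;unes Customer,")

entity = re.compile(r"i&#932;unes", re.IGNORECASE)
anchored = re.compile(r"^.*i&#932;unes.*$", re.IGNORECASE)
anchored_multiline = re.compile(r"^.*i&#932;unes.*$", re.IGNORECASE | re.MULTILINE)
decoded_char = re.compile("i\u03a4unes")

print(bool(entity.search(raw_html)))              # True: the entity is in the source
print(bool(entity.search(rendered)))              # False: decoding removed the entity
print(bool(anchored.search(raw_html)))            # False: ^ and $ span the whole text
print(bool(anchored_multiline.search(raw_html)))  # True: ^ and $ match per line
print(bool(decoded_char.search(rendered)))        # True: match the decoded character
```

Under those assumptions, the literal entity can only be found in the raw source, and an unanchored pattern is the simplest way to find it in text that spans several lines, because `.` does not cross a line break and `^`/`$` without a multiline flag refer to the whole text. Note also that 932 is a decimal value; the code point in hex is 3A4, so an escape such as `\x{932}` points at a different character.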

Thanks to all involved (when they did).

percepts
Senior user
Senior user
Posts: 5282
Joined: 2009-10-20 16:33
Location: Sceptred Isle

Re: Another Regular Expression Question

Post by percepts » 2014-08-06 02:05

jimimaseye wrote:Well said.

I have my answer courtesy of one of our fellow forum users. Valuable help and much appreciated too.

The answer, for future readers and reference, was

rawbody     /i&\#932;unes./i
There are 2 main key points:

1, its a RAWBODY check (as opposed to BODY check) that is needed.
2, the opening ^.* and closing .*$ that I was originally using (from Percepts original suggestion early on in the thread) was causing the problem. Only when it was removed did this then get picked up.

Note: of course dont forget the \ for escaping the hash (otherwise it is classed as a start of a comment)

Thanks to all involved (when they did).
I told you earlier in the thread that .* wasn't "required", so don't try to cover your ignorance by blaming me. Furthermore, you have broken forum rules by adding a new question onto the end of the last. And lastly, I made precisely zero reference to .* for this last question, but again in your ignorance you have tried to apply something from an earlier question and got it wrong. If you were so smart you wouldn't be here asking questions. .* is NOT the reason for whatever you used being at fault. It is not possible for it to be wrong, because of what its presence means. It is your own inability to get it right which was the problem.

jimimaseye
Moderator
Moderator
Posts: 8159
Joined: 2011-09-08 17:48

Re: Another Regular Expression Question

Post by jimimaseye » 2014-08-06 03:58

Wow. Are you in a bad mood or something, or is this normal? You really don't want to let this go?? Despite DooM's suggestion?

No one was "trying to blame" you and I don't think anyone would have read it as that. Your name was mentioned as a reference to earlier postings in the conversation with me. But you seem hell-bent on responding with a non-response and creating ill feeling about this instead of taking it for what it is. And if you want to attack me for what I write, well OK, let's do it your way...

Regarding the $ and ^ :
percepts wrote:In many cases they aren't required but I always put them in for clarity and to remove uncertainty,...
Yep. That's well and truly saying they aren't required. You definitely insisted not to bother with them there ... apart from the few cases remaining when 'many' (leaving 'some', as it isn't ALL) didn't apply ... and, err ... "always". (Oh look, you just said ALL.)
percepts wrote:And lastly I made precisely zero refernece to .* for this last question
I never said .* was the problem either, I said "the opening ^.* and closing .*$". Don't quote me on something that was never there to be quoted (especially given you're doing such a bad job of reading your own quotes above).
percepts wrote:.....in your ignorance you have tried to apply something from an earlier question and got it wrong. If you were so smart you wouldn't be here asking questions. .* is NOT the reason for whatever you used being the fault..... It is your own inability to get it right which was the problem.
Well yeah. It's my inability to get it right which leads me TO ASK FOR CORRECTION! So shoot me for using a forum. (How dare I?!)
percepts wrote:to expand it a little you look for

i(&#932;unes|tunes)
following my initial statement that "At the moment I have (for the standard word): ^.*itunes.*$/i".
Again, that's you NOT saying "remove .*", and instead simply referring to the inner content around the itunes word, just as I had asked for. And THAT implies the rest of the quoted formula doesn't need correction (ergo is correct). You didn't say anything about changing to, or ensuring, 'rawbody' as part of the expression either. And that is why I mentioned it in my summary.
percepts wrote: in your ignorance you have tried to apply something from an earlier question and got it wrong. If you were so smart you wouldn't be here asking questions.
You're absolutely right on this one. (Yey!!!) I have 'applied something from an earlier question', about REGEX formulas, on a thread entitled "Another Regular Expression Question", ... answered by you, ... in my "ignorance" because I don't know the answer, ... and yes, if I WERE so smart knowing the answers I wouldn't be here asking the questions. Would I?!! So it seems your logic is:
"someone comes on asking a question about something they have already written and already know. How dare they think they know anything?! And that seems just bang out of order, so I'm not going to correct them and will just chastise them for thinking they know everything... (despite them coming on the forum to ask for correction to the logic they have quoted)"

Oh, and whilst we are pointing out your involvement, which you were so very keen to raise and deny: at the end I also said "Thanks to all involved (when they did)", which included YOU as a contributor to the early solutions around which my latest question was based. For my part, I had no problems with you (other than your obvious lack of a sense of humour and your misreading of when to reply and when not to reply, like now), and was/AM appreciative of your contributions to the solutions to my problems. But somehow you seem keen on making me change my opinion and on continuing YOUR bad-tempered aggression. I don't know why. Is it because you feel an authority due to your contribution history, and that makes you exempt from politeness? It's a shame. Your answers and help are sufficient to earn respect. Your terse remarks and confrontations serve to negate it.

I'm done. I will post here no more. I hate this. And if you wanted the 'get one over him' feeling, then the mere fact that I have felt I needed to respond to you in this manner means that you've got it. Well done, old dog.

^DooM^
Site Admin
Posts: 13861
Joined: 2005-07-29 16:18
Location: UK

Re: Another Regular Expression Question

Post by ^DooM^ » 2014-08-07 20:58

Ask away Turtleneck, just put it in off topic discussions :)

I am locking this thread. Please all collect your toys and put them back in your pram. Thanks!