Simple rule is not working
I need to prevent spammers from sending Chinese spam to my "nospam@example.com" email address. No-one else sends to "nospam", so maybe they cannot read English. All of these emails include Chinese in the Subject and sometimes in the From line, so I created 2 rules:
1. If Subject contains "?" or "?utf" then delete email.
2. If From contains "?" then delete email.
The "?" does appear when Chinese is used, according to the email's source view, and tests from the hMailServer control panel do find a match, but it doesn't delete such mail. What am I missing?
Re: Simple rule is not working
From documentation:
https://www.hmailserver.com/documentati ... e=ts_rules
Encoded header fields
Headers of email messages may be encoded using different types of encoding. If you set up a criterion that searches a header field, hMailServer will try to decode that header field if possible. If the header is in Japanese, for example, the contents of the mail will not be searchable.
The "?" in the raw view most likely no longer exists after the decoding attempt.
Besides, a real question mark in a Subject is common. I wouldn't fire a "Delete message" on it.
Katip
--
HMS 5.7, MariaDB 10.4.10, SA 4.0.0, ClamAV 0.103.8
Re: Simple rule is not working
Sure, but that would mean not searchable in "Japanese". For example the header data might look like...
Subject: =?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?=
So why isn't it deleted?
katip wrote:
besides, a real question mark Subject is common. i wouldn't fire a "Delete message".
I do not expect any email sent to nospam@example.com to be of interest.
Re: Simple rule is not working
Kendo wrote: ↑2019-03-20 21:48
I do not expect that any email sent to nospam@example.com to be of interest.
Then simply delete it all.
In SpamAssassin you can test for language and character sets
Also I use xx.countries.nerd.dk as a DNSBL to check country of origin (yes, I understand that may be different from what you are after).
Just 'cause I link to a page and say little else doesn't mean I am not being nice.
https://www.hmailserver.com/documentation
Re: Simple rule is not working
Can't do that because nospam@ needs to accept some critical mail. Cannot change nospam@ because to English speaking clients (95%) they get the message, ie: no spam to that address!
mattg wrote:
In SpamAssassin you can test for language and character sets
I don't use SpamAssassin because it can kill wanted mail. I only want to apply my rule to ONE account.
I don't see why my rule doesn't work. Can you?
- jimimaseye
- Moderator
- Posts: 10053
- Joined: 2011-09-08 17:48
Re: Simple rule is not working
I use spamassassin and I rarely have false positives. By rare I mean one or two a month. And most of the time they are hit by my personal rules and not the default published rules.
AND... spamassassin doesn't 'kill' anything; worst case scenario is it marks a mail as spam. Your MTA is what deletes emails if you configure it to.
[Entered by mobile. Excuse my spelling.]
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
Re: Simple rule is not working
You do understand that the rules use SMTP envelope, not mail headers...
Have you confirmed in your logs that the address and subject do actually have the required text in them?
Also, can you please show a screenshot of your rules? Sometimes and/or gets confusing. I've assumed that you have two rules that aren't working. Is that correct? Screenshots really help to minimize the misunderstandings...
Re: Simple rule is not working
Yes you are correct, sorry my bad
Perhaps the ? needs to be escaped with a \?
Re: Simple rule is not working
AFAIK not only envelope, also any header, even whole body.
\? won't help (also as regex). rules apply after DEcoding (base64/qp) as i understand from documentation.
(SA could detect it with a "full" definition)
Katip
Re: Simple rule is not working
Combine Script and Rules.
Rule: "Exterminate"
Criteria - Custom Header "X-hMailServer-Exterminate" CONTAINS "YES"
Action - Oh well... DELETE.... Or get in line like Daleks and shout "EXTERMINATE"
The string "=?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?=" is Base64 encoded using UTF-8 charset.
Code: Select all
Sub OnAcceptMessage(oClient, oMessage)
   '
   ' Save state of Encode Flag
   '
   Dim ECFlag : ECFlag = oMessage.EncodeFields
   '
   ' Disable Encoding of Header Fields
   '
   oMessage.EncodeFields = False
   '
   ' Do stuff with RAW headers
   '
   If oMessage.Subject = "=?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?=" Then _
      oMessage.HeaderValue("X-hMailServer-Exterminate") = "YES"
   '
   ' Save new header into message.
   '
   oMessage.Save
   '
   ' Restore Encoding of Header Fields
   '
   oMessage.EncodeFields = ECFlag
End Sub
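To see why a plain 'Subject contains "?"' criterion never fires on this message, it can help to decode the sample header outside hMailServer. This is only a minimal Python sketch using the standard library's email.header module. It is not hMailServer code, it only approximates what the rule engine does internally, and the variable names are made up for illustration.

```python
from email.header import decode_header, make_header

# Raw Subject value exactly as it appears in the message source.
raw_subject = "=?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?="

# decode_header() splits the value into (bytes, charset) chunks;
# make_header() joins them back into readable text.
decoded_subject = str(make_header(decode_header(raw_subject)))

print(decoded_subject)         # the readable Subject, without the "=?utf-8?B?" wrapper
print("?" in raw_subject)      # question marks exist only in the raw form
print("?" in decoded_subject)  # and are gone once the header is decoded
```

The question marks are part of the encoding wrapper, not part of the text, which is why the script above compares against the raw header with EncodeFields switched off.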
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: Simple rule is not working
Well, that's like telling all white people to stay away...
You need to find other triggers. Can you paste the raw headers of a few messages?
SørenR.
Re: Simple rule is not working
From: =?utf-8?B?5ZGo5pyJ5piO?= <gqvvwen@xxx.net>
Subject: =?utf-8?B?NDQzOTfjgIrmipbpn7PokKXplIDjgItndg==?=
From: =?utf-8?B?6am+6b6a5oq156e96Zif572a6K6w55qL?= <aiyis@yyy.mobi>
Subject: =?utf-8?B?MTA5OTk3MumUgOWUrueyvuiLsTLlpKnlvLrljJborq3nu4N5cWM=?= =?utf-8?B?NWUxdg==?=
From: "bssg" <anna@zzz.com>
Subject: =?gb2312?B?tPS459fu0MKz9sa3yKvM17mkvt/G68nPU02ztcSjYW15zPg=?= =?gb2312?B?tbDSstPDyc/By7/asazNzL6ryeTN6ru5uPjH5cDtuMm+u1C438fl?=
=?gb2312?B?1K2w5g==?=
From: =?utf-8?B?5LuO5pak?= <wzv@aaa.org>
Subject: =?utf-8?B?562U5aSNOueglCDlj5Eg55qEIOWboiDpmJ8g5p6EIOW7ug==?=
From: =?utf-8?B?6K646IyX56C+?= <pbvoz@bbb.com>
Subject: =?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?=
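What these headers have in common is not the "?" but the characters they decode to. As a rough sketch of that idea (Python, standard library only, not something hMailServer rules can do by themselves), the snippet below decodes a header and checks for characters in a few CJK Unicode ranges. The function names and the chosen ranges are assumptions made for this example, and the last two sample subjects are legitimate mails quoted elsewhere in this thread.

```python
from email.header import decode_header

# Assumed ranges: CJK punctuation + Hiragana/Katakana, CJK Extension A,
# and CJK Unified Ideographs. Adjust to taste.
CJK_RANGES = ((0x3000, 0x30FF), (0x3400, 0x4DBF), (0x4E00, 0x9FFF))


def decode_mime_header(raw):
    """Turn an RFC 2047 encoded header value into readable text."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            try:
                parts.append(chunk.decode(charset or "ascii", errors="replace"))
            except LookupError:  # charset name Python does not know
                parts.append(chunk.decode("latin-1"))
        else:
            parts.append(chunk)  # header had no encoded words at all
    return "".join(parts)


def has_cjk(text):
    """True if any character falls inside one of the assumed CJK ranges."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in CJK_RANGES)


samples = [
    "=?utf-8?B?5ZGo5pyJ5piO?= <gqvvwen@xxx.net>",
    "=?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?=",
    "=?UTF-8?Q?Fwd=3A_60_=C3=A5rs_f=C3=B8dselsdags=2Fsommer_fest?=",
    "Re: [clamav-users] Database updated over unencrypted connection?",
]
for raw in samples:
    print(has_cjk(decode_mime_header(raw)), raw)
```

A check like this flags the first two samples and leaves the encoded Danish subject and the plain English question alone, which a match on "?" or "?utf" cannot do.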
Re: Simple rule is not working
Kendo wrote: ↑2019-03-21 23:12
From: =?utf-8?B?5ZGo5pyJ5piO?= <gqvvwen@xxx.net>
Subject: =?utf-8?B?NDQzOTfjgIrmipbpn7PokKXplIDjgItndg==?=
From: =?utf-8?B?6am+6b6a5oq156e96Zif572a6K6w55qL?= <aiyis@yyy.mobi>
Subject: =?utf-8?B?MTA5OTk3MumUgOWUrueyvuiLsTLlpKnlvLrljJborq3nu4N5cWM=?= =?utf-8?B?NWUxdg==?=
From: "bssg" <anna@zzz.com>
Subject: =?gb2312?B?tPS459fu0MKz9sa3yKvM17mkvt/G68nPU02ztcSjYW15zPg=?= =?gb2312?B?tbDSstPDyc/By7/asazNzL6ryeTN6ru5uPjH5cDtuMm+u1C438fl?=
=?gb2312?B?1K2w5g==?=
From: =?utf-8?B?5LuO5pak?= <wzv@aaa.org>
Subject: =?utf-8?B?562U5aSNOueglCDlj5Eg55qEIOWboiDpmJ8g5p6EIOW7ug==?=
From: =?utf-8?B?6K646IyX56C+?= <pbvoz@bbb.com>
Subject: =?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?=
Really ??
Why are you redacting the senders' email addresses if they are sending SPAM? You don't expect us to believe they use domains like aaa, bbb, xxx, yyy and zzz? We are not stupid, you know.
There are tons of other headers in the emails that can be used to identify UCE... I trap some of my SPAM by matching the "List-Unsubscribe" header.
From, To and Subject are easily faked.
This is what I'm working on at the moment regarding UCE and SPAM: https://www.hmailserver.com/forum/viewt ... 35#p209935
SørenR.
Re: Simple rule is not working
Really? Sometimes spammers send using legitimate email addresses that are innocent!
How would you like having your email address scraped from a forum post?
Re: Simple rule is not working
They already tried ... and failed. I've been running my own mailserver for close to 20 years and my current domain is 16 years old. I first started working with The Internet around 1986 so I have first hand experience with a SPAM free Internet.
SørenR.
Re: Simple rule is not working
Same here. My domain is more than 20 years old. Always had my own mail server and for 15 years provided an Internet service in a rural area for about 600 dial-up modem users who attracted spam like magnets. I have seen spam in many forms. I use about 100 email addresses and recycle most, which kills all of my spam... except in this instance.
You seem to be trying to block known domains of spammers. In my case that is of no use. Simply detecting "?" would solve my problem, if only that function worked.
Re: Simple rule is not working
Should I be looking for an encoded version of "?" and if so, what should I try?
- jimimaseye
Re: Simple rule is not working
Ahem.
Kendo wrote: ↑2019-03-20 16:47
I need to prevent spammers from sending Chinese spam to my "nospam@example.com" email address. No-one else sends to "nospam" so may be they cannot read English. All of these emails include Chinese in the Subject and sometimes in the From line, so I created 2 rules:
1. If Subject contains "?" or "?utf" then delete email.
2. If From contains "?" then delete email.
The "?" does appear when Chinese is used, according to the email's source view, and tests from the hMailServer control panel do find a match, but it doesn't delete such mail. What am I missing?
Read the above.... now look at these examples below:
1)
Subject: =?utf-8?Q?Must=20Read=20Style=20Guide=21=20How=20to=20Dress=20for=20Transition=20Season=F0=9F=8C=9E=E2=9D=84?=
From: =?utf-8?Q?The=20Tight=20Spot?= <marketing@thetightspot.net>
To: <grumblermail-stopspammingme@yahoo.net>
Date: Sun, 17 Mar 2019 07:05:48 +0000
X-Report-Abuse: Please report abuse for this campaign here: https://mailchimp.net/contact/abuse/?u= ... eaffe=1c7f
List-ID: 7f72a00046a31bfffc829mc list <7f72a00046a31bfffc829.50445.list-id.mcsv.co>
List-Unsubscribe: <https://thetightspot.us12.list-manage.n ... ea8c1ffce2>, <mailto:unsubscribe-mc.us12_7f72a000462c829.ea8c1ffce2-1cc7f5799@mailin.mcsv.co?subject=unsubscribe>
List-Unsubscribe-Post: List-Unsubscribe=One-Click
Contains both an encoded Subject and List-Unsubscribe links. This email is not spam; it is required and was requested.
2)
To: clamav-users@lists.clamav.co
From: Arnaud Jacques <webmaster@securiteinfo.net>
Subject: Re: [clamav-users] Database updated over unencrypted connection?
X-BeenThere: clamav-users@lists.clamav.co
Precedence: list
List-Id: ClamAV users ML <clamav-users.lists.clamav.co>
List-Unsubscribe: <https://lists.clamav.co/mailman/options/clamav-users>,
 <mailto:clamav-users-request@lists.clamav.co?subject=unsubscribe>
List-Archive: <https://lists.clamav.co/pipermail/clamav-users/>
List-Post: <mailto:clamav-users@lists.clamav.co>
List-Help: <mailto:clamav-users-request@lists.clamav.co?subject=help>
List-Subscribe: <https://lists.clamav.co/mailman/listinfo/clamav-users>,
 <mailto:clamav-users-request@lists.clamav.co?subject=subscribe>
Reply-To: ClamAV users ML <clamav-users@lists.clamav.co>
Contains both a List-Unsubscribe and a "?" (question mark) in the subject. This is not spam; it is a standard email from a mailing list which I have requested.
3)
To: Gr <grumblerfriends-fb@yahoo.net>
Subject: =?UTF-8?B?8J+TtyBSaWNrIEF0a2luc29u?=
 =?UTF-8?B?IGFkZGVkIGEgbmV3IHBo?=
 =?UTF-8?B?b3Rv?=
Return-Path: notification@facebookmail.net
From: "Facebook" <notification@facebookmail.net>
Reply-to: noreply <noreply@facebookmail.net>
Errors-To: notification@facebookmail.net
X-Facebook-Notify: nf_photo_story; mailid=58461afd6c8d3G5af38ac572dcG584fa4
List-Unsubscribe: <https://www.facebook.net/o.php?k=AS0pdT ... f3ac572d14>
Feedback-ID: 0:nf_photo_story:Facebook
Contains both a List-Unsubscribe and a question mark. This is not spam; it is an authorised/requested Facebook notification email.
And I have MANY MANY more. And all of them would fall foul of both of the 2 methods you both mentioned.
Re: Simple rule is not working
jimimaseye wrote: ↑2019-03-22 01:29
Contains both list unsubscribe and a question mark. This is not spam, it is a authorised/requested facebook notification email.
And I have MANY MANY more. And all of them would fall foul of both of the 2 methods you both mentioned.
I shall repeat again... any mail sent to my nospam@example.com that contains Chinese or Japanese characters in either the Subject or From line is definitely SPAM. And usually without Unsubscribe.
So there is nothing falling foul in this case. How about an answer to my previous question?
Re: Simple rule is not working
Kendo wrote: ↑2019-03-22 00:47
Same here. My domain is more than 20 years old . Always had my own mail server and for 15 years provided an Internet service in a rural area for about 600 diallup modem users who attracted spam like magnets. I have seen spam in many forms. I use about 100 email addresses and recycle most which kills all of my spam... except in this instance.
You seem to be trying to block known domains of spammers. In my case that is of no use. Simply detecting "?" would solve my problem, if only that function worked.
I look for patterns, and if I find one I have a way in. If there are no patterns then we are dealing with bots, and I don't have the tech to stop bots.
I get my fair share of "rogue" SPAM too. There is always some trigger that will give it away in some form or figure...
The latest I have is emails in English, German, Persian and Portuguese. The common trigger is RegEx: (https:\/\/)(.*)(\.drive\.google\.com\/open\?id\=) in the body.
I filter on "HELO", "From", "X-Envelope-From", "Subject", "Body", "BodyHTML", "List-Unsubscribe" and "IPRange".
I also check specific RBL's for "SnowShoe SPAM" and "LashBack SPAM" besides the usual RBL's, SURBL's AND my extremely well trained SpamAssassin.
I have 3 levels of SPAM. Level 1 is rejected (not received), Level 2 is scored up to 6 and goes into your SPAM folder AND is forwarded to SPAMTrap user, Level 3 scored above 6 is ONLY(!) forwarded to SPAMTrap user. If False Positives are found, the SPAMTrap user can distribute.
All users INBOX and SPAM folder are used in SpamAssassin training EVERY NIGHT ... PLUS ... Level 3 SPAM and a "FalsePositive" folder in the SPAMTrap user.
Users know that if they find SPAM in the INBOX they must move it to the SPAM folder (not delete it!) and vice versa. The SPAM folder is cleared from time to time by a script.
My domain covers my family and friends, 5 addresses currently. This week only 3 mails later identified as SPAM evaded my defences, and there was 1 false positive (an acknowledgement that no one reads anyway).
The downside of my installation is the lack of statistical data. I sometimes have to wait a week or two to see if a particular filter works as intended.
I have all my filters in an XML file that I can change on-the-fly so I don't have to edit the script all the time.
I have custom logs that give me the "executive summary" every day, so general "maintenance" is only a 10-20 minute job.
I was wondering ... The "?" ... How does your OS display UTF-8 characters not implemented in your current codepage?
SørenR.
- jimimaseye
Re: Simple rule is not working
Kendo wrote: ↑2019-03-22 01:43
I shall repeat again... any mail sent to my nospam@example.com that contains Chinese or Japanese characters in either the Subject or From line is definitely SPAM.
And how are you identifying whether it's Chinese or Japanese? Your answer:
Kendo wrote:
If Subject contains "?" or "?utf" then delete email.
(pointless stating 'utf?' because "utf?" contains "?")
And yet, as proven, ANY email can contain a question mark.
How does a single question mark in a subject field (which by design can contain ANY character) define an email as unwanted or Chinese? (None of my examples given earlier were Japanese, Chinese or even unwanted - they were all written in plain old English and one wasn't even encoded.)
As for answering your question: I think you have been given your answer, or at least enough to explain why (a) it doesn't work or (b) it really SHOULDN'T work (thankfully). "UTF?=" appearing is encode talk, but rules apply to the DEcoded text (and I doubt the decoded Chinese text contains a "?"). And looking for "?" is just nuts; when it does match, the majority of the matches will almost certainly be false positives.
Re: Simple rule is not working
Kendo wrote: ↑2019-03-22 01:43
jimimaseye wrote: ↑2019-03-22 01:29
Contains both list unsubscribe and a question mark. This is not spam, it is a authorised/requested facebook notification email.
And I have MANY MANY more. And all of them would fall foul of both of the 2 methods you both mentioned.
I shall repeat again... any mail sent to my nospam@example.com that contains Chinese or Japanese characters in either the Subject or From line is definitely SPAM. And usually without Unsubscribe.
So there is nothing falling foul in this case. How about an answer to my previous question?
If only it was that simple... If you can find some code somewhere that will identify the language used, then you are well on your way.
SørenR.
Re: Simple rule is not working
I just created a global rule named 'test ??'
If subject contains '?'
then set header value 'X-Tested' to 'Yes'
Then I sent a message from my gmail account to a hmailserver account and the logs show
"DEBUG" 26136 "2019-03-22 09:53:37.033" "Applying rule test ??"
"DEBUG" 26136 "2019-03-22 09:53:37.033" "Performing rule action"
The message appears twice in my inbox, once without the extra header, and once with it
Seems to me that contains '?' should work
Re: Simple rule is not working
This ...
Code: Select all
Subject: =?UTF-8?Q?Fwd=3A_60_=C3=A5rs_f=C3=B8dselsdags=2Fsommer_fest?=
is actually a friend inviting me to his 60th birthday and a garden party. It would be really bad to miss that one because of a rule deleting "?UTF".
SørenR.
Re: Simple rule is not working
mattg wrote: ↑2019-03-22 01:58
I just created a global rule named 'test ??'
If subject contains '?'
then set header value 'X-Tested' to 'Yes'
Then I sent a message from my gmail account to a hmailserver account and the logs show
"DEBUG" 26136 "2019-03-22 09:53:37.033" "Applying rule test ??"
"DEBUG" 26136 "2019-03-22 09:53:37.033" "Performing rule action"
The message appears twice in my inbox, once without the extra header, and once with it
Seems to me that contains '?' should work
I suspect the "?" is not really a "?"; it is displayed as a "?" because the character is not in the codepage. I usually get small square boxes on my Windows XP, but my Windows 10 IIRC shows a "?".
SørenR.
Re: Simple rule is not working
What may be more relevant is how Windows Server 2008 displays the characters, because that is where the filtering is actioned.
jimimaseye wrote: ↑2019-03-22 01:50
looking for "?" is just nuts and when matching will almost certainly be majority false positive in its success.
How many times do I need to repeat that any mail sent to my nospam@example.com that contains "?" in either Subject or From is unwanted mail? That is for me to decide, thank you very much!
Exactly, and that is my question: why doesn't it work? How to rephrase "?"... back to my previous question...
Should I be looking for an encoded version of "?" and if so, what should I try?
Re: Simple rule is not working
Does hMailserver require a restart after applying a rule?
Re: Simple rule is not working
"X-Mailer: Microsoft Outlook Express...." is very significant in Chinese spam (along with Foxmail). likely a common default X-Mailer in their massmailing softwares. who uses OE anymore?
also "(From|Sender|Reply-To|Received|Resent-From)\:\s[^\n\r]+?(\@|\.)(qq|126|163)\.com" does good job in my SA.
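For anyone who wants to try that pattern outside SpamAssassin before trusting it, here is a small Python sketch. The regular expression is copied verbatim from above, but the first two header lines are hypothetical addresses invented for the test, and Python's re module is only a stand-in for SA's Perl regex engine.

```python
import re

# Pattern copied verbatim from the post above.
PATTERN = re.compile(
    r"(From|Sender|Reply-To|Received|Resent-From)\:\s[^\n\r]+?(\@|\.)(qq|126|163)\.com"
)

# The first two lines are hypothetical; the third is a legitimate
# sender quoted earlier in this thread.
headers = [
    "From: Example Sender <sender@163.com>",
    "Reply-To: sender@mail.qq.com",
    "From: Arnaud Jacques <webmaster@securiteinfo.net>",
]

for line in headers:
    hit = PATTERN.search(line) is not None
    print("MATCH   " if hit else "no match", line)
```

Because the pattern requires "@" or "." directly before qq, 126 or 163, a domain that merely ends in those characters is not caught.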
Katip
Re: Simple rule is not working
No.
to your question: you can't detect "?" in an ENcoded header simply with a "contains ?" rule unless it actually contains a question mark in DEcoded (human readable) form.
Soren's suggestion would help (script + rules).
Katip
Re: Simple rule is not working
I also copied some of the subjects that you posted above to test, and they all get caught by my rule
Do you have a global rule or an account level rule?
I don't know how else I can test what you are experiencing
Re: Simple rule is not working
To test properly you may need to use the same characters in the Subject line, for example...
企业管理资料
It is an account level rule and only for ONE account.
Re: Simple rule is not working
Matt,
OP is not searching for a question mark that you typed in the Subject or From, but for the ones the mail client adds when it encodes the message. These question marks are not available during rule processing. HMS first decodes (converts) everything to human-readable text before proceeding with rules.
Katip
- jimimaseye
Re: Simple rule is not working
Kendo wrote: ↑2019-03-22 05:15
jimimaseye wrote: ↑2019-03-22 01:50
looking for "?" is just nuts and when matching will almost certainly be majority false positive in its success.
How many times do I need to repeat that any mail sent to my nospam@example.com that contains "?" in either Subject or From is unwanted mail? That is for me to decide, thank you very much!
Indeed it is. You are free to do what you want, just as I am free to have an opinion on the choice. After all, it is your mail server, and as long as you are not sending spam to me and the world it simply doesn't affect me; you can take my advice or leave it as you wish. However, what you actually require is not how to capture a "?", but how to capture encoded text that comes prefixed with "utf?" when looking at it in encoded form. I think enough has been said here to help you achieve your goal, although you have yet to say how you will distinguish Chinese text from any other genuine email that uses encoding. If you would implement spamassassin you could use: https://spamassassin.apache.org/full/3. ... xtCat.html. You might also find this useful: http://spamassassin.1065346.n5.nabble.c ... p8392.html
Re: Simple rule is not working
katip wrote: ↑2019-03-22 05:38
"X-Mailer: Microsoft Outlook Express...." is very significant in Chinese spam (along with Foxmail). likely a common default X-Mailer in their massmailing softwares. who uses OE anymore?
also "(From|Sender|Reply-To|Received|Resent-From)\:\s[^\n\r]+?(\@|\.)(qq|126|163)\.com" does good job in my SA.
SA Bayesian training is very powerful. My "custom" filters are only applied if no other anti-spam measure tags an email.
SørenR.
Re: Simple rule is not working
when I copy that into the subject line of a message sent from gmail, it displays like that (企业管理资料) in Thunderbird after traveling through my hMailserver, and doesn't get caught by the rule...
When sending from my Thunderbird via hMailServer to the same account, again it shows as 企业管理资料
How do I get the subject '企业管理资料' to display as / be decoded as '?utf?' or whatever?
Re: Simple rule is not working
mattg wrote: ↑2019-03-22 13:15
when I copy that into the subject line of a message sent from gmail, it displays like that (企业管理资料) in Thunderbird after traveling through my hMailserver, and doesn't get caught by the rule...
When sending on my thunderbird via hmaislerver to the same account, again it shows as 企业管理资料
How do I get the subject '企业管理资料' to display as / be decoded as '?utf?' or whatever?
https://www.hmailserver.com/forum/viewt ... 80#p210480
SørenR.
Re: Simple rule is not working
But my point is that my subjects never look like this
"=?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?="
They always look like
'企业管理资料'
- jimimaseye
Re: Simple rule is not working
I believe it depends on the sending client encoding at time of sending.
[Entered by mobile. Excuse my spelling.]
Re: Simple rule is not working
yes, because your mail client makes it look so.
just like HMS rules see it. so, NO question marks.
have you seen message source? (raw content)
Katip
Re: Simple rule is not working
jimimaseye wrote: ↑2019-03-22 14:29
I believe it depends on the sending client encoding at time of sending.
[Entered by mobile. Excuse my spelling.]
There are rules to abide by when encoding headers like Subject, From, To and Cc.
https://tools.ietf.org/html/rfc1342
https://tools.ietf.org/html/rfc2047
Example:
Code: Select all
From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
=?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
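To make the layout of such an "encoded-word" concrete, here is a small Python sketch that takes the Subject from the example apart. The regular expression and variable names are my own simplification of the RFC 2047 grammar, not taken from any library, and only the Base64 ("B") case is decoded.

```python
import base64
import re

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=")

# The two-part Subject from the example, joined by the folding space.
subject = ("=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= "
           "=?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=")

# Each match is a (charset, encoding, payload) tuple.
words = ENCODED_WORD.findall(subject)

# Decode every Base64 payload with its own charset and join the pieces;
# whitespace between adjacent encoded-words is not part of the text.
text = "".join(
    base64.b64decode(payload).decode(charset)
    for charset, encoding, payload in words
    if encoding.upper() == "B"
)

print(words)
print(text)
print(subject.count("?"), "question marks in the raw header")
print(text.count("?"), "question marks in the decoded text")
```

Every "?" in the raw header is one of the four delimiters of an encoded-word, so none of them survives decoding.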
SørenR.
Re: Simple rule is not working
jimimaseye wrote: ↑2019-03-22 14:29
I believe it depends on the sending client encoding at time of sending.
probably this: TB -> Options, Display, Formatting, Advanced
Katip
Re: Simple rule is not working
TB = Thunderbird?
That is of no use because it is the mail server applying the rule and not your local mail client.
- jimimaseye
Re: Simple rule is not working
I think you need a recap or summary of what has been said here:
1, "utf?" appears due to the email being encoded.
2, Searching for "?" (or any other character) only searches the decoded string - it doesn't search for the character as in 'utf?' appearing in the raw source. hMailServer always decodes emails (if encoded) before doing string comparisons in rules.
3, there is no way of determining Chinese language or any other with a rule alone.
4, Chinese or any other non-western language or alphabet can be sent as text and will not have a "?" in it unless the sender deliberately types it in as part of the string (as at the end of a question).
5, email clients can be set to encode emails - it is the choice of the sender. (Knowing how makes no difference to you and your goal.)
6, spamassassin is the best, most reliable option (and probably the only option) for determining the language of the text in an email.
7, (a prediction: you will go defensive on this post because you insist the above can't be right due to it not fitting your needs and you still want to filter Chinese text with a rule.)
[Entered by mobile. Excuse my spelling.]
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
Re: Simple rule is not working
We already got all of that, except for those clouding the issue by posting irrelevant suggestions.
Re: Simple rule is not working
I believe you got your answer in the second post...
But, let me make it perfectly clear to you. What you want to do ... YOU CAN'T!, IT WON'T WORK!. PERIOD!
Encoded header fields
Headers of email messages may be encoded using different types of encoding. If you set up a criterion that searches a header field, hMailServer will try to decode that header field if possible. If the header is in Japanese, for example, the contents of the mail will not be searchable.
If you want to filter on RAW ENCODED headers, you need to script like the example I posted.
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: Simple rule is not working
Hindsight? The functionality is in the scripting, so why modify hMail? And why are you now including SA? SA is only an additional tool.
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
- jimimaseye
- Moderator
- Posts: 10053
- Joined: 2011-09-08 17:48
Re: Simple rule is not working
I believe it is because he is still focusing on wanting to delete mail recognised as Chinese, Japanese etc. (which SpamAssassin does) and not understanding that the appearance of "utf8", and dealing with it (which your script does), has nothing to do with it. It seems he thinks SpamAssassin handles utf-8 better and therefore determines Chinese text more efficiently or, at least, that its functionality can be replicated in a rule.
[Entered by mobile. Excuse my spelling.]
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
Re: Simple rule is not working
IMO he was expecting that rule criteria can look up terms as seen in the message source, just like opening an eml with notepad and Ctrl+F.
even if the HMS rules section had such an optional feature, it wouldn't make much sense, i.e. not worth implementing.
criteria (in this concept) must correspond to what a human reads. who would want to search for "YSBxdWVzdGlvbiBtYXJrIHByb2JsZW0=" just to find the term "a question mark problem", and who knows which message will arrive base64 encoded or in plain ASCII text? ill-conceived, or whatever you want to call it.
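To illustrate the point about raw versus decoded searches, here is a small Python sketch (not hMailServer code) that writes the same phrase three ways a Subject header might carry it and compares a raw-source search with a search after decoding. The variable names are illustrative only, and Python's `email.header` decoder stands in for hMailServer's.

```python
import base64
from email.header import decode_header, make_header

PHRASE = "a question mark problem"

# The same phrase as it might appear in a raw Subject header.
b64_payload = base64.b64encode(PHRASE.encode("utf-8")).decode("ascii")
raw_variants = [
    PHRASE,                                          # plain ASCII, not encoded
    "=?utf-8?B?" + b64_payload + "?=",               # base64 encoded-word
    "=?utf-8?Q?" + PHRASE.replace(" ", "_") + "?=",  # quoted-printable encoded-word
]

# Decode each variant the way a mail-aware tool would before comparing strings.
decoded_variants = [str(make_header(decode_header(raw))) for raw in raw_variants]

for raw, decoded in zip(raw_variants, decoded_variants):
    print(f"{raw!r}: raw match={PHRASE in raw}, decoded match={PHRASE in decoded}")
```

Only the unencoded variant is found by a raw-source search, while all three are found after decoding, so a single human-readable criterion only works against decoded text.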
Katip
--
HMS 5.7, MariaDB 10.4.10, SA 4.0.0, ClamAV 0.103.8
Re: Simple rule is not working
Well... I do not receive much spam from China, actually none... so could someone please try this?
Rule: China!
Criteria (OR)
1: "From" RegEx "[\u2E80-\u2FD5\u3190-\u319f\u3400-\u4DBF\u4E00-\u9FCC\uF900-\uFAAD]"
2: "Subject" RegEx "[\u2E80-\u2FD5\u3190-\u319f\u3400-\u4DBF\u4E00-\u9FCC\uF900-\uFAAD]"
Action
1: Set header value "X-hMailServer-China" "YES"
I suspect there may be false positives if we run into emoji.
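For anyone who wants to try the character class before creating the rule, this is a rough offline check in Python. It assumes Python's `re` module treats the `\uXXXX` ranges the same way hMailServer's regex engine does, which is not confirmed here, and it uses Python's `email.header` decoder as a stand-in for hMailServer's decoding. `looks_cjk` is a hypothetical helper name, and the encoded Subject is the one quoted earlier in the thread.

```python
import re
from email.header import decode_header, make_header

# Character class copied from the rule above.
CJK = re.compile(r"[\u2E80-\u2FD5\u3190-\u319f\u3400-\u4DBF\u4E00-\u9FCC\uF900-\uFAAD]")

def looks_cjk(text: str) -> bool:
    """Return True if the (already decoded) text contains a character in the class."""
    return CJK.search(text) is not None

RAW_SUBJECT = "=?utf-8?B?5LyB5Lia566h55CG6LWE5paZ?="
decoded_subject = str(make_header(decode_header(RAW_SUBJECT)))

print("raw encoded Subject:", looks_cjk(RAW_SUBJECT))          # only ASCII in the raw form
print("decoded Subject:    ", looks_cjk(decoded_subject))      # Han ideographs after decoding
print("plain English:      ", looks_cjk("Is this a question?"))
print("emoji:              ", looks_cjk("Party tonight \U0001F389"))
```

In this sketch the emoji sample does not match, because its code point (U+1F389) lies outside all five ranges. Whether hMailServer's engine treats such characters the same way would need testing on a live server.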
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: Simple rule is not working
Where I found this: https://stackoverflow.com/questions/136 ... in-unicode
The exact ranges for Chinese characters (except the extensions) are
[\u2E80-\u2FD5\u3190-\u319f\u3400-\u4DBF\u4E00-\u9FCC\uF900-\uFAAD].
1: [\u2e80-\u2fd5] https://unicode-table.com/en/blocks/cjk ... upplement/
CJK Radicals Supplement is a Unicode block containing alternative, often positional, forms of the Kangxi radicals. They are used headers in dictionary indices and other CJK ideograph collections organized by radical-stroke.
2: [\u3190-\u319f] https://unicode-table.com/en/blocks/kanbun/
Kanbun is a Unicode block containing annotation characters used in Japanese copies of classical Chinese texts, to indicate reading order.
3: [\u3400-\u4DBF] https://unicode-table.com/en/blocks/cjk ... tension-a/
CJK Unified Ideographs Extension-A is a Unicode block containing rare Han ideographs.
4: [\u4E00-\u9FCC] https://unicode-table.com/en/blocks/cjk ... deographs/
CJK Unified Ideographs is a Unicode block containing the most common CJK ideographs used in modern Chinese and Japanese.
5: [\uF900-\uFAAD] https://unicode-table.com/en/blocks/cjk ... deographs/
CJK Compatibility Ideographs is a Unicode block created to contain Han characters that were encoded in multiple locations in other established character encodings, in addition to their CJK Unified Ideographs assignments, in order to retain round-trip compatibility between Unicode and those encodings.
For the details please refer to here (https://unicode-table.com/en/#cjk-unifi ... xtension-a), and the extensions are provided in other answers.
Online RegEx tester: https://regex101.com/
IF I'm not mistaken then VBScript is using Javascript RegEx syntax and hMailServer RULES is using PERL RegEx syntax. Yeah!
Online MIME Decoder: http://dogmamix.com/MimeHeadersDecoder/
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: Simple rule is not working
yes, it works.
we get tons of China spam (no complaint, thanks to SA). so i tried enough examples.
OP may find this solution useful.
Katip
--
HMS 5.7, MariaDB 10.4.10, SA 4.0.0, ClamAV 0.103.8
- jimimaseye
- Moderator
- Posts: 10053
- Joined: 2011-09-08 17:48
Re: Simple rule is not working
Does it work for Japanese? Op wants to click Japanese as well. And question marks.
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
Re: Simple rule is not working
jimimaseye wrote: ↑2019-03-23 17:36
Does it work for Japanese? Op wants to click Japanese as well. And question marks.
i found this but have no idea if it can be safely used in a combo solution for Chinese/Japanese/Korean spam
https://www.key-shortcut.com/index.php?id=413&L=1
while we're slowly getting OT, it's interesting to learn that there is no strict question mark rule in Japanese & Chinese at all. from Wikipedia:
"Fullwidth question mark in East Asian languages
The question mark is also used in modern writing in Chinese and Japanese, although it is not strictly necessary in either. Usually it is written as fullwidth form in Chinese and Japanese, in Unicode: U+FF1F ？ FULLWIDTH QUESTION MARK (HTML &#65311;)."
Katip
--
HMS 5.7, MariaDB 10.4.10, SA 4.0.0, ClamAV 0.103.8
Re: Simple rule is not working
katip wrote: ↑2019-03-23 18:50
i found this but have no idea if it can be safely used in a combo solution for Chinese/Japanese/Korean spam
https://www.key-shortcut.com/index.php?id=413&L=1
The RegEx you tested is based on those tables.
Eg..
4: [\u4E00-\u9FCC] https://unicode-table.com/en/blocks/cjk ... deographs/
CJK Unified Ideographs is a Unicode block containing the most common CJK ideographs used in modern Chinese and Japanese.
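On the question of Japanese, Korean and question marks, here is a quick Python sketch of what the five ranges cover. It only reports how Python's `re` module evaluates the character class; hMailServer's engine is assumed, not confirmed, to agree. The sample strings are illustrative and written as escapes so the code points are explicit.

```python
import re

# Same character class as in the proposed rule.
CJK = re.compile(r"[\u2E80-\u2FD5\u3190-\u319f\u3400-\u4DBF\u4E00-\u9FCC\uF900-\uFAAD]")

samples = {
    "kanji (Han ideographs)":  "\u65e5\u672c\u8a9e",              # U+65E5 U+672C U+8A9E
    "hiragana only":           "\u3053\u3093\u306b\u3061\u306f",  # U+3040 to U+309F block
    "katakana only":           "\u30c6\u30b9\u30c8",              # U+30A0 to U+30FF block
    "hangul only":             "\uc548\ub155",                    # U+AC00 to U+D7AF block
    "fullwidth question mark": "\uff1f",                          # U+FF1F
}

results = {name: CJK.search(text) is not None for name, text in samples.items()}

for name, matched in results.items():
    print(f"{name}: {'match' if matched else 'no match'}")
```

By this check, text containing Han ideographs (Chinese, or Japanese written with kanji) matches, while kana-only Japanese, Hangul-only Korean and the fullwidth question mark fall outside the five ranges and would need additional ranges in the class.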
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.
Re: Simple rule is not working
ahh yes, you were already referring to CJK
maybe the page i came across has it all in one lot
Katip
--
HMS 5.7, MariaDB 10.4.10, SA 4.0.0, ClamAV 0.103.8
Re: Simple rule is not working
Well, let's see what OP has to say when he returns from the dead with a wine flu... It was Friday, so chances are he's been going Mach 5 to forget he ever asked the question.
SørenR.
Woke is Marxism advancing through Maoist cultural revolution.