As part of my pURI-BL project, I want to create a custom ruleset. I can write basic rules, but my issue is that I want to test the body against hundreds or thousands of URLs, so the list is *probably* too long to turn into a single regex string.
Ideally, a custom plugin would be better, so the ruleset could be created using one URL per line, but I'm totally lost with Perl.
Any hints or ideas?
Need help creating rule
Re: Need help creating rule
I would suggest testing the body prior to delivery.
To avoid performance issues, checking just the first 100 lines might be sufficient, I think.
oMessage.Body (or HTMLBody) is a string, and I suppose it can be split on vbCrLf with the VBScript Split function. You regex-check the first 100 lines; if any contains a URL (see below for a good pattern), extract it and push it into an array, then look up each item in your DB table.
Depending on the result (bad URL found or not), you add a header such as X-HMS-BadURL = True.
Then you can play with rules as you like, based on the existence of this header.
Just throwing this out off the top of my head.
(("[^<>@\\]+")|([^<> @\"]+))@(\[([0-9]{1,3}\.){3}[0-9]{1,3}\]|(?=.{1,255}$)((?!-|\.)[a-zA-Z0-9-]{0,62}[a-zA-Z0-9])(|\.(?!-|\.)[a-zA-Z0-9-]{0,62}[a-zA-Z0-9]){1,126})
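For what it's worth, here is a rough sketch of the flow described above: scan only the first 100 lines, pull out anything that looks like a URL, look each one up, and set the header on a hit. It is written in Python purely for illustration and would need porting to VBScript for an hMailServer script. The names `find_bad_urls` and `tag_message`, the simplified URL regex, and the in-memory set standing in for the DB table are all assumptions, not anything from hMailServer itself.

```python
import re

# Simplified URL pattern, assumed for illustration only
# (not the pattern posted in this thread).
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def find_bad_urls(body, bad_urls, max_lines=100):
    """Return URLs found in the first max_lines lines of body that are in bad_urls."""
    hits = []
    # Mail bodies use CRLF line endings; only the first max_lines are scanned.
    for line in body.split("\r\n")[:max_lines]:
        for url in URL_RE.findall(line):
            url = url.rstrip(".,;:)")  # drop trailing sentence punctuation
            # bad_urls is a set here; in practice this would be a DB lookup.
            if url in bad_urls and url not in hits:
                hits.append(url)
    return hits


def tag_message(headers, body, bad_urls):
    """Add X-HMS-BadURL to the headers dict when a listed URL is found."""
    if find_bad_urls(body, bad_urls):
        headers["X-HMS-BadURL"] = "True"
    return headers
```

A set lookup per URL stays fast even with thousands of listed URLs, which sidesteps the problem of building one giant regex.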
Katip
--
HMS 5.7, MariaDB 10.4.10, SA 4.0.0, ClamAV 0.103.8
Re: Need help creating rule
Sorry for the nonsense pattern (that one matches e-mail addresses, not URLs). Here is a good one:
https://regexr.com/39nr7
Katip
Re: Need help creating rule
katip wrote: ↑2022-05-16 09:39
I would suggest testing the body prior to delivery.
To avoid performance issues, checking just the first 100 lines might be sufficient, I think.
oMessage.Body (or HTMLBody) is a string, and I suppose it can be split on vbCrLf with the VBScript Split function. You regex-check the first 100 lines; if any contains a URL (see below for a good pattern), extract it and push it into an array, then look up each item in your DB table.
Depending on the result (bad URL found or not), you add a header such as X-HMS-BadURL = True.
(("[^<>@\\]+")|([^<> @\"]+))@(\[([0-9]{1,3}\.){3}[0-9]{1,3}\]|(?=.{1,255}$)((?!-|\.)[a-zA-Z0-9-]{0,62}[a-zA-Z0-9])(|\.(?!-|\.)[a-zA-Z0-9-]{0,62}[a-zA-Z0-9]){1,126})
Then you can play with rules as you like, based on the existence of this header.
Just throwing this out off the top of my head.

Good idea. I already split the body at "</head>" when it exists, so I don't pick up things like w3.org and style URLs.
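In case it helps anyone following along, splitting the body at "</head>" could look roughly like the sketch below. It is Python for illustration only (not the actual script), `strip_head` is a made-up name, and matching the tag case-insensitively is an assumption on top of what was described.

```python
import re

# Closing head tag, matched case-insensitively (an assumption for this sketch).
HEAD_END_RE = re.compile(r"</head\s*>", re.IGNORECASE)


def strip_head(html_body):
    """Return everything after the first </head>, or the whole body if there is none."""
    match = HEAD_END_RE.search(html_body)
    if match is None:
        return html_body  # plain-text or headless HTML: scan everything
    return html_body[match.end():]
```

Running this before the URL scan keeps DOCTYPE, namespace, and stylesheet links in the head out of the lookup.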