Best way to match BTC address

This section contains scripts that hMailServer has contributed with. hMailServer 5 is needed to use these.

Post by glenluo » 2022-08-03 16:20

What is the best regex to match a BTC address? The addresses look like one of the two forms below:

bc1qm93ujfwsr40wnkrmcr62fkgz950fwguvq7a7m4
bc1qma28jg7kw2uk9cfhaxc27s2z0d3wa9m9pfht7u
bc1qmcyr7507jtkhnq95asur7yr6t5erpsj8nr4jzx
bc1qm2neqx0g95ft33ue8qvdvhmznqy8cppwgr8wp3
bc1q67uzflp6epem4uwceq5807655e5dzegx96tjvr

128a74yNjE2iKCZSkVMApvHxYZnUxF2uQ6
12ELWfXgRgqhtt8KenQbAfuBbAb1Rd3GJ7
12ZLU11k4T1MCk8Kc5uxweTqTSamPiK5fD
12djMjPKd6Bv2BaXUNVuAjnuusKA66qCkX
12kieSEdCV4ikxdXXXC23ZsDcNmmKrRmwA
12mojuzVPeSh26qmA9GLJFADo4chgGbtj3
12qyWVJL65VaKiUx1DFbCYckQcowXHGWs4
199ZxJd71PJkMjjTdQdh7ekmQ2amE1tVe6

Post by jimimaseye » 2022-08-03 17:01

This:

(^1\w{33}$)|(^bc1q\w{38}$)

https://regex101.com/

That matches anything beginning with a "1" followed by exactly 33 more word characters (letters, digits or underscore), OR anything beginning with "bc1q" followed by exactly 38 more word characters.

Adjust accordingly.
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829
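
For anyone who wants to try the pattern outside regex101, here is a minimal sketch in Python (not VBScript, purely for offline testing) that runs it against two of the sample addresses from the first post. `re.MULTILINE` stands in for `.MultiLine = True`, and the helper name `find_btc` is made up for this illustration.

```python
import re

# Pattern from the post above; MULTILINE lets ^ and $ match at every line.
PATTERN = re.compile(r"(^1\w{33}$)|(^bc1q\w{38}$)", re.MULTILINE)

def find_btc(body):
    """Return every whole-line match in body (hypothetical helper)."""
    return [m.group(0) for m in PATTERN.finditer(body)]

body = (
    "Send payment to:\n"
    "bc1qm93ujfwsr40wnkrmcr62fkgz950fwguvq7a7m4\n"
    "or\n"
    "128a74yNjE2iKCZSkVMApvHxYZnUxF2uQ6\n"
)
matches = find_btc(body)

# \w is loose: a "1" followed by 33 underscores is accepted as well.
loose = find_btc("1" + "_" * 33)
```

Because `\w` also accepts `_` and characters that never occur in legacy addresses, this pattern is a coarse filter rather than a validator.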

Post by glenluo » 2022-08-04 03:45

Function oLookup(strRegEx, strMatch, bGlobal)
    ' Run strRegEx against strMatch and return the MatchCollection.
    With CreateObject("VBScript.RegExp")
        .Pattern = strRegEx
        .Global = bGlobal
        .MultiLine = True
        .IgnoreCase = True
        Set oLookup = .Execute(strMatch)
    End With
End Function

Sub OnAcceptMessage(oClient, oMessage)

    Dim strRegEx, Matches
    strRegEx = "(^1\w{33}$)|(^bc1q\w{38}$)"
    Set Matches = oLookup(strRegEx, oMessage.Body, False)
    ' A single match is enough to reject the message, so no loop is needed.
    If Matches.Count > 0 Then
        Result.Value = 2
        Result.Message = "The email body with bitcoin address is considered spam!"
    End If

End Sub

Post by RvdH » 2022-08-04 07:45

Not sure if these give better or more accurate results than the one jimimaseye posted, but here are some more:

Code: Select all

^(bc1|[13])[a-zA-HJ-NP-Z0-9]{25,39}$
https://ihateregex.io/expr/bitcoin-address/

Code: Select all

^(?:[13]{1}[a-km-zA-HJ-NP-Z1-9]{26,33}|bc1[a-z0-9]{39,59})$
https://regexland.com/regex-bitcoin-addresses/
CIDR to RegEx: d-fault.nl/cidrtoregex
DNS Lookup: d-fault.nl/dnstools
DKIM Generator: d-fault.nl/dkimgenerator
DNSBL Lookup: d-fault.nl/dnsbllookup
GEOIP Lookup: d-fault.nl/geoiplookup
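
To compare the two patterns RvdH linked, here is a small Python sketch (Python only for offline testing; the variable names are made up). It checks two sample addresses from the first post plus a deliberately bogus string that contains `0`, a character that does not appear in Base58 legacy addresses.

```python
import re

# The two patterns quoted above, unchanged.
IHATEREGEX = re.compile(r"^(bc1|[13])[a-zA-HJ-NP-Z0-9]{25,39}$")
REGEXLAND = re.compile(
    r"^(?:[13]{1}[a-km-zA-HJ-NP-Z1-9]{26,33}|bc1[a-z0-9]{39,59})$"
)

legacy = "128a74yNjE2iKCZSkVMApvHxYZnUxF2uQ6"
segwit = "bc1qm93ujfwsr40wnkrmcr62fkgz950fwguvq7a7m4"
bogus = "1" + "0" * 33  # right length, but "0" is not a Base58 character

results = {
    name: (bool(IHATEREGEX.match(value)), bool(REGEXLAND.match(value)))
    for name, value in [("legacy", legacy), ("segwit", segwit), ("bogus", bogus)]
}
```

Both patterns accept the real samples; only the second one rejects the bogus string, because its legacy character class leaves out `0` and `l`.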

Post by RvdH » 2022-08-04 09:01

Code: Select all

strRegEx ="(^1\w{33}$)|(^bc1q\w{38}$)"
How likely is it that the bitcoin address starts on a new line and is not prefixed or suffixed by anything, e.g. a space?

Maybe something like this would be better: it checks whether the address starts on a new line or after a space, and it can also be followed by a space.

Code: Select all

(?:\s|^)((?:[13]{1}[a-km-zA-HJ-NP-Z1-9]{26,33}|bc1[a-z0-9]{39,59}))(?:\s|$)
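
A quick way to see the difference is to run both patterns over a sentence with an embedded address. This is a Python sketch for offline testing only (names are made up), with `re.MULTILINE` standing in for `.MultiLine = True`.

```python
import re

# Line-anchored pattern from glenluo's script.
ANCHORED = re.compile(r"(^1\w{33}$)|(^bc1q\w{38}$)", re.MULTILINE)
# Whitespace-delimited pattern suggested above.
DELIMITED = re.compile(
    r"(?:\s|^)((?:[13]{1}[a-km-zA-HJ-NP-Z1-9]{26,33}|bc1[a-z0-9]{39,59}))(?:\s|$)",
    re.MULTILINE,
)

body = "Pay 0.1 BTC to 12ELWfXgRgqhtt8KenQbAfuBbAb1Rd3GJ7 within 24 hours."

anchored_hits = [m.group(0) for m in ANCHORED.finditer(body)]
delimited_hits = [m.group(1) for m in DELIMITED.finditer(body)]

# An address directly followed by punctuation is still missed.
punctuated = "Pay to 12ELWfXgRgqhtt8KenQbAfuBbAb1Rd3GJ7."
punctuated_hits = [m.group(1) for m in DELIMITED.finditer(punctuated)]
```

The anchored pattern finds nothing in the sentence, while the delimited one captures the address in group 1. The last case shows that an address followed directly by a full stop still slips through, since the pattern requires whitespace or end of line after it.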

Post by SorenR » 2022-08-04 13:00

If you want a sample base for testing your RegEx I suggest https://www.bitcoinabuse.com/reports :mrgreen:
SørenR.

There are two types of people in this world:
1) Those who can extrapolate from incomplete data

Post by RvdH » 2022-08-04 13:13

SorenR wrote:
2022-08-04 13:00
If you want a sample base for testing your RegEx I suggest https://www.bitcoinabuse.com/reports :mrgreen:
SpamAssassin > 3.4.2 & HashBL
https://btcblack.it/

Post by SorenR » 2022-08-04 13:35

My own "bitcoin checker"

Code: Select all

        '
        '   Bitcoin "BODYTXT"
        '
        strRegEx = myListsRegEx(myListsDict, "//Bitcoin/Bodytxt")
        If strRegEx <> "VOID" Then
            If oMessage.HasBodyType("text/plain") Then
                Set oMatchCollection = oLookup(strRegEx, oMessage.Body, True)
                For Each oMatch In oMatchCollection
                    Call myListsStat(myListsDict, oMatch.Value)
                    EventLogX.Write( LPad("Bitcoin", 15, " ") & vbTab & oClient.IPAddress )
                Next
                If oMatchCollection.Count > 0 Then Exit Do
            End If
            If oMessage.HasBodyType("text/html") Then
                Set oMatchCollection = oLookup(strRegEx, HTML2Text(oMessage.HTMLBody), True)
                For Each oMatch In oMatchCollection
                   Call myListsStat(myListsDict, oMatch.Value)
                   EventLogX.Write( LPad("Bitcoin (HTML)", 15, " ") & vbTab & oClient.IPAddress )
                Next
                If oMatchCollection.Count > 0 Then Exit Do
            End If
        End If
The interesting line here is: Set oMatchCollection = oLookup(strRegEx, HTML2Text(oMessage.HTMLBody), True)

Code: Select all

Function HTML2text(html)

    On Error Resume Next
    Err.Clear

    Dim oSCtrl
    Set oSCtrl = CreateObject("MSScriptControl.ScriptControl")
    oSCtrl.AllowUI = 0
    oSCtrl.Language = "JScript"
    oSCtrl.Reset()
    oSCtrl.AddCode("function convertToPlain(html){var tempDivElement = document.createElement(""div"");tempDivElement.innerHTML = html;return tempDivElement.textContent || tempDivElement.innerText || """";}")
    HTML2text = oSCtrl.Run("convertToPlain", html)

    ' Check Err before "On Error GoTo 0", because that statement resets the Err object.
    If (Err.number <> 0) Then
        EventLog.Write( "ERROR: Function HTML2text(html)" )
        EventLog.Write( "Error       : " & Err.number )
        EventLog.Write( "Error (hex) : 0x" & Hex(Err.number) )
        EventLog.Write( "Source      : " & Err.Source )
        EventLog.Write( "Description : " & Err.Description )
        Err.Clear
    End If
    On Error GoTo 0

End Function
The catch here is CreateObject("MSScriptControl.ScriptControl"): unfortunately it is a 32-bit ActiveX control.

Fear not - someone (NOT Microsoft) made a 64bit version.

https://github.com/tablacus/TablacusScr ... ag/1.2.5.2

Once the module is registered with your Windows 10 or whatever (I use it on Windows 10 AND Windows Server 2019) the only change will be from:

CreateObject("MSScriptControl.ScriptControl") ==> CreateObject("ScriptControl")

That's it!
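
To show why the HTML-to-text step matters before matching, here is a Python sketch (Python only for offline testing, not a replacement for the ScriptControl approach). The class `TextExtractor` and the function `html_to_text` are made up for this illustration; the regex is RvdH's whitespace-delimited pattern from earlier in the thread, and the HTML snippet is invented to show an address split across tags.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def html_to_text(html):
    """Strip tags and return the concatenated text (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

BTC = re.compile(
    r"(?:\s|^)((?:[13]{1}[a-km-zA-HJ-NP-Z1-9]{26,33}|bc1[a-z0-9]{39,59}))(?:\s|$)"
)

# Address split across two tags, as an obfuscating sender might do.
html = "<p>Send to <b>bc1qma28jg7kw2uk9cfhaxc27s</b><span>2z0d3wa9m9pfht7u</span></p>"

raw_hits = BTC.findall(html)
text_hits = BTC.findall(html_to_text(html))
```

Searching the raw HTML finds nothing because the tags interrupt the address; after stripping the tags the full address is matched.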

I have a few other uses for the ScriptControl:

Code: Select all

Function UrlEncode(s)
    Dim oSCtrl
    Set oSCtrl = CreateObject("MSScriptControl.ScriptControl")
    oSCtrl.Language = "JScript"
    UrlEncode = oSCtrl.CodeObject.encodeURIComponent(s)
    UrlEncode = Replace(UrlEncode, "'", "%27")
    UrlEncode = Replace(UrlEncode, """", "%22")
End Function

Function UrlDecode(s)
    Dim oSCtrl
    Set oSCtrl = CreateObject("MSScriptControl.ScriptControl")
    oSCtrl.Language = "JScript"
    UrlDecode = Replace(s, "+", " ")
    UrlDecode = oSCtrl.CodeObject.decodeURIComponent(UrlDecode)
End Function

' Uses Windows internal document format conversion of HTML to plain text.
Function HTML2text(html)

    On Error Resume Next
    Err.Clear

    Dim oSCtrl
    Set oSCtrl = CreateObject("MSScriptControl.ScriptControl")
    oSCtrl.AllowUI = 0
    oSCtrl.Language = "JScript"
    oSCtrl.Reset()
    oSCtrl.AddCode("function convertToPlain(html){var tempDivElement = document.createElement(""div"");tempDivElement.innerHTML = html;return tempDivElement.textContent || tempDivElement.innerText || """";}")
    HTML2text = oSCtrl.Run("convertToPlain", html)

    ' Check Err before "On Error GoTo 0", because that statement resets the Err object.
    If (Err.number <> 0) Then
        EventLog.Write( "ERROR: Function HTML2text(html)" )
        EventLog.Write( "Error       : " & Err.number )
        EventLog.Write( "Error (hex) : 0x" & Hex(Err.number) )
        EventLog.Write( "Source      : " & Err.Source )
        EventLog.Write( "Description : " & Err.Description )
        Err.Clear
    End If
    On Error GoTo 0

End Function

' A VBScript wrapper that parses a JSON string by evaluating it in JScript
Function jsonDecode(jsonString)
    On Error Resume Next
    Dim oSCtrl
    Set oSCtrl = CreateObject("MSScriptControl.ScriptControl")
    oSCtrl.Language = "JScript"
    Set jsonDecode = oSCtrl.Eval("(" & jsonString & ")")
    On Error GoTo 0
End Function

' Check whether an object has a given property, e.g. isValidProperty(oGeoIP, "message") tests whether "oGeoIP.message" exists
Function isValidProperty(obj, propName)
    Dim oSCtrl
    Set oSCtrl = CreateObject("MSScriptControl.ScriptControl")
    oSCtrl.AllowUI = 0
    oSCtrl.Language = "JSCript"
    oSCtrl.Reset()
    oSCtrl.AddCode("function CheckProperty(obj, propName) {return (typeof obj[propName] != ""undefined"");}")
    isValidProperty = oSCtrl.Run("CheckProperty", obj, propName)
End Function

Post by RvdH » 2022-08-06 12:23

RvdH wrote:
2022-08-04 13:13
SorenR wrote:
2022-08-04 13:00
If you want a sample base for testing your RegEx I suggest https://www.bitcoinabuse.com/reports :mrgreen:
SpamAssassin > 3.4.2 & HashBL
https://btcblack.it/
FYI, the regex used in spamassassin/HashBL is:

Code: Select all

\b(?<!=)([13][a-km-zA-HJ-NP-Z1-9]{25,34}|bc1[acdefghjklmnpqrstuvwxyz234567890]{30,62})\b

Code: Select all

if (version >= 3.004003)
  ifplugin Mail::SpamAssassin::Plugin::HashBL
    body          GB_HASHBL_BTC eval:check_hashbl_bodyre('bl.btcblack.it', 'raw/max=10/shuffle', '\b(?<!=)([13][a-km-zA-HJ-NP-Z1-9]{25,34}|bc1[acdefghjklmnpqrstuvwxyz234567890]{30,62})\b')
    priority      GB_HASHBL_BTC -100
    tflags        GB_HASHBL_BTC net publish
    reuse         GB_HASHBL_BTC
    describe      GB_HASHBL_BTC Message contains BTC address found on BTCBL
    score         GB_HASHBL_BTC 5.0 # limit
  endif
endif
The btcblack HashBL rule has been included in the stock rules for some months, and in the KAM.cf ruleset (https://mcgrail.com/template/projects#KAM1) for longer than that; you only need to enable the HashBL plugin to use it.
The regexp used by the rule can be checked at https://svn.apache.org/viewvc/spamassas ... iew=markup
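
To get a feel for what the SpamAssassin pattern picks up, here is a Python sketch for offline testing (the sample body is invented, using addresses from the first post). Python's `re` supports the `(?<!=)` lookbehind; as far as I know VBScript.RegExp does not support lookbehind, so this pattern would need adapting before being passed to oLookup.

```python
import re

# Pattern quoted above from the HashBL rule, unchanged.
SA_BTC = re.compile(
    r"\b(?<!=)([13][a-km-zA-HJ-NP-Z1-9]{25,34}"
    r"|bc1[acdefghjklmnpqrstuvwxyz234567890]{30,62})\b"
)

body = (
    "Pay to 12ELWfXgRgqhtt8KenQbAfuBbAb1Rd3GJ7. "
    "Or use bc1q67uzflp6epem4uwceq5807655e5dzegx96tjvr, "
    "ref id=199ZxJd71PJkMjjTdQdh7ekmQ2amE1tVe6"
)

hits = SA_BTC.findall(body)
```

The word boundaries let the pattern match addresses followed by punctuation, and the lookbehind skips the candidate that comes right after an `=` sign.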