Need help optimizing database

Forum for things that doesn't really have anything to do with hMailServer. Such as php.ini, beer, etc etc.
Post Reply
palinka
Senior user
Senior user
Posts: 1545
Joined: 2017-09-12 17:57

Need help optimizing database

Post by palinka » 2020-02-13 00:04

As part of my firewall ban project I have a table for information scraped from the firewall log. It has 2 columns: timestamp and IP. Each of those columns purposefully contain duplicate information. For example, an IP can attempt to connect multiple times and therefore be duplicated the same number of times. The firewall log timestamp only resolves to seconds (not milliseconds) so its possible to have duplicate entries for that as well.

As I approach a million rows, its starting to get very slow. I cannot think of a way (shocking, I know) to optimize the table because I really need these duplicate data for counting purposes. Ideally, I'd like to be able to split them somehow to make a unique column in order to speed up queries.

Any ideas?

User avatar
mattg
Moderator
Moderator
Posts: 20548
Joined: 2007-06-14 05:12
Location: 'The Outback' Australia

Re: Need help optimizing database

Post by mattg » 2020-02-13 00:26

unique entries and a third column for count
Just 'cause I link to a page and say little else doesn't mean I am not being nice.
https://www.hmailserver.com/documentation

palinka
Senior user
Senior user
Posts: 1545
Joined: 2017-09-12 17:57

Re: Need help optimizing database

Post by palinka » 2020-02-13 04:09

mattg wrote:
2020-02-13 00:26
unique entries and a third column for count
I'm trying to visualize this.

Table 1:
id AUTOINCREMENT PRIMARY UNIQUE
ipaddress UNIQUE
ipcount INT
last-timestamp NOT NULL
Trigger = insert into table 2 id, timestamp

Table 2:
id (matches table 1 id)(not unique or primary - duplicates allowed)
timestamp (not unique or primary - duplicates allowed)

Don't I still end up in the same place with a million rows without a unique column in table 2? Seems like what I have is slightly more efficient than adding more columns and tables that effectively do the same thing. I know I'm missing something here. I just can't wrap my head around the missing link.

tunis
Normal user
Normal user
Posts: 237
Joined: 2015-01-05 20:22
Location: Sweden

Re: Need help optimizing database

Post by tunis » 2020-02-13 09:08

All columns you use in query you should have index on.
If you use "select x where timestamp > 'y' order by id" timestamp and id should have index on them to speed up query.
HMS 5.6.8 B2494.24 on Windows Server 2019 Core VM.
HMS 5.6.8 B2494.24 on Windows Server 2016 Core VM.
HMS 5.6.7 B2425.16 on Windows Server 2012 R2 Core VM.

User avatar
jimimaseye
Moderator
Moderator
Posts: 8305
Joined: 2011-09-08 17:48

Re: Need help optimizing database

Post by jimimaseye » 2020-02-13 09:50

tunis wrote:
2020-02-13 09:08
All columns you use in query you should have index on.
If you use "select x where timestamp > 'y' order by id" timestamp and id should have index on them to speed up query.
+1

Eg

CREATE INDEX tmp_hm_fwban_timestamp ON hm_fwban (timestamp);

[Entered by mobile. Excuse my spelling.]
5.7 on test.
SpamassassinForWindows 3.4.0 spamd service
AV: Clamwin + Clamd service + sanesecurity defs : https://www.hmailserver.com/forum/viewtopic.php?f=21&t=26829

User avatar
Dravion
Senior user
Senior user
Posts: 1611
Joined: 2015-09-26 11:50
Location: Germany
Contact:

Re: Need help optimizing database

Post by Dravion » 2020-02-13 12:57

First of all, use SQL Prepared Statements.

If you insert dynamic string variables in your SQL-Statements, you force the Database Query Planer to Re-Evaluate every new SQL-Query which needs to be executed
while Prepared Statements needs to be calculated and compiled only once.

You also should avoid Table scan (full search of a table)
instead of Index Scans which are in most cases faster by creating Indexes.Avoid Foreign Keys Relations.

Reduce your PHP Result sets by only fetching the Data you need, not long lists of Data.

palinka
Senior user
Senior user
Posts: 1545
Joined: 2017-09-12 17:57

Re: Need help optimizing database

Post by palinka » 2020-02-13 13:08

Dravion wrote:
2020-02-13 12:57

You also should avoid Table scan (full search of a table)
That is definitely what I'm doing now. :(

Thanks for the ideas. When I get to my computer I'm going to try a couple of things and report back.

Post Reply