Tutorial - Indexes Part 2 - When to use them
Since we're getting great speed improvements, why don't we index everything we can? Before we go any further, there are some
issues to be aware of when using indexes.
As you might have noticed in the previous tutorial, the indexes were created after the table had been created and all of the data had been imported. This is for a reason: indexes are very time consuming to create and maintain. Each time a row is added or updated, the index has to be updated with the new information as well.
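To see that overhead in isolation, here is a minimal sketch (not part of the tutorial's scripts) that loads the same generated rows into an unindexed table and an indexed one. The table and index names and the use of generate_series() and md5() are just assumptions for illustration; turn on \timing in psql and compare the two inserts.
-- hypothetical comparison: the same bulk insert with and without an index
CREATE TABLE insert_noindex (id INT, name TEXT);
CREATE TABLE insert_withindex (id INT, name TEXT);
CREATE INDEX insert_withindex_id ON insert_withindex(id);
-- 100,000 rows of generated data into the unindexed table
INSERT INTO insert_noindex
SELECT g, md5(g::text) FROM generate_series(1, 100000) AS g;
-- the same rows again, but now every row also adds an index entry
INSERT INTO insert_withindex
SELECT g, md5(g::text) FROM generate_series(1, 100000) AS g;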
Let's create another data import script and check the difference.
Here's the new script:
#!/usr/bin/perl

$count = 1;
$outputfile = 'import2.sql';
@chars = ('A' .. 'Z', 'a' .. 'z', 0 .. 9);
@numbers = (0 .. 9);
@single_chars = ('a' .. 'e');
$totalrecords = 5000000; # 5 million

open(OUTPUT, "> $outputfile");

print OUTPUT "CREATE TABLE indextest2 (";
print OUTPUT "id INT, name TEXT, number INT, letter CHAR(1)";
print OUTPUT ");\n";

# create the index before any data is loaded
print OUTPUT "CREATE INDEX index2_id ON indextest2(id);\n";

print OUTPUT "COPY indextest2 (id,name,number,letter) FROM stdin;\n";

while ($count <= $totalrecords) {
    $randstring = join('', @chars[map { rand @chars } (1 .. 8)]);
    $randnum = join('', @numbers[map { rand @numbers } (1 .. 8)]);
    $randletter = join('', @single_chars[map { rand @single_chars } (1)]);
    print OUTPUT $count."\t".$randstring."\t".$randnum."\t".$randletter."\n";
    $count++;
}

print OUTPUT "\\.\n";
print OUTPUT "VACUUM ANALYZE indextest2;\n";

close OUTPUT;
exit;
The only difference is we're creating the index before importing the data - just the one line.
Let's import this data and compare the time it takes to the previous tutorial:
[pgsql@elrond pgsql]$ time psql -d indextest < import2.sql
CREATE TABLE
CREATE INDEX
VACUUM
real 10m42.827s
user 0m21.720s
sys 0m4.320s
Just by changing the order of commands and creating the index before any data is imported, we've doubled the time it takes to
import the same number of records.
As you can see, it is definitely beneficial to create the index after the data has been imported, so that the index doesn't get updated or referenced while the data is being loaded. Smaller tables might not show the same difference in processing time, but for very large tables like these the difference is quite substantial. Of course, this only applies to tables where you are able to bulk load data like this. For dynamic tables where rows are constantly being added and removed, such as the tables behind a website, there is no advantage to taking this approach.
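If you do need to bulk load into a table that already has an index, one common approach is to drop the index, load the data, and recreate the index afterwards. Here is a minimal sketch using the indextest2 table from above; the data file path is a hypothetical placeholder.
-- drop the index so the bulk load doesn't have to maintain it
DROP INDEX index2_id;
-- bulk load from a file on the server (path is just an example)
COPY indextest2 (id, name, number, letter) FROM '/path/to/more_data.txt';
-- recreate the index and refresh the statistics afterwards
CREATE INDEX index2_id ON indextest2(id);
VACUUM ANALYZE indextest2;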
Deleting rows from the table also affects the index, especially if you delete a large number of rows; removing just one or two rows will not make a noticeable difference.
PostgreSQL also caches statistics about the table and its index so it doesn't have to recalculate the number of rows and other information each time the index is manipulated or accessed. To refresh these statistics, we need to VACUUM ANALYZE the table. It basically looks at the table and records the number of rows, the distribution of values and so on, for the planner to use when deciding whether to use the index.
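To check that the statistics are in place and the index is actually being used, you can ask the planner directly with EXPLAIN. This is just an illustrative query against indextest2, and the exact plan output will vary.
VACUUM ANALYZE indextest2;
-- with fresh statistics, a selective lookup on the indexed column
-- should show an index scan rather than a sequential scan
EXPLAIN SELECT * FROM indextest2 WHERE id = 4000;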