Thought this was cool: How To Test Your Upgrades – pt-upgrade

Upgrades are usually one of the biggest part of any database infrastructure maintenance. Even with enough planning something else can go bad after sending your production application to the version you’ve upgraded to. Let’s look at how one Percona Toolkit tool, pt-upgrade can help you identify what to expect and test your upgrades better which is one important step when upgrading your servers.

pt-upgrade

This tool lets you test your SELECT queries against multiple MySQL servers and reports on how each type of query performs from every server you let it test on. It can report differences with results like row counts, checksums, column count, query execution time including execution errors and warnings.

So how do you exactly test your queries on servers of multiple versions.

  1. First, ideally, each server must have identical set of data. One approach and most recommended is by having a slave upgraded to your target version and let it replicate until such time you are ready to test.  Another is populating spare servers from backup and testing with them. With the former, you can easily verify that data is consistent across the master and the slave with pt-table-checksum.
  2. Capture your production queries, we recommend either slow query log (use long_query_time = 0) or tcpdump (other types supported by pt-query-digest’s –type option is also possible, see below).
  3. Test away! Some example scenarios below.

Using slow query log:

# You can also run the slow log through pt-query-digest like the commands below
#    to limit the number of queries per fingerprint to test with.
pt-upgrade --fingerprints --run-time=1h mysqld-slow.log h=127.0.0.1,P=5091 h=127.0.0.1,P=5517

Using sources supported by pt-query-digest –type:

# tcpdump - You can replace --type tcpdump with the specific type and
#    path to the appropriate type of file you wish to input as source of queries
# --print simply returns the queries in slow log format and --sample
#    limits this to 100 queries only
# --no-report simply skips the overhead pt-query-digest had to compute
#    values for reports
pt-query-digest --print --no-report --sample 100 --type tcpdump /path/to/tcpdump.out \
   | pt-upgrade --fingerprints --run-time=1h h=mysql50 h=mysql51 h=mysql55

Few essential notes when using pt-upgrade:

  1. Make sure to cover all your query types from your log (slow log or tcpdump). If you have 5GB of log collected within 2hrs, it does not make sense to test all 5GB if your runtime will not cover all fingerprints. Passing the queries to pt-query-digest and using a sample of 100 like above should be enough sample per fingerprint and reduces the runtime (–runt-time) of the tests.
  2. Using a separate database and table with –temp-database and –temp-table is recommended, this ensures separation of the test table from your production databases and tables.
  3. Use a separate account for testing – one that has read (i.e. SELECT) to all objects within the database server and has all privileges to the specified –temp-database from #1. pt-upgrade is a read-only tool, however I can only emphasize you should try to use an account with only the necessary privileges.
  4. It is also important to note to run pt-upgrade itself on a 3rd server if the servers you are testing against are on two different machines. This helps account for the network latency and provide more accurate results.
  5. Do not test against production servers, 1) because it can incur extra load and 2) the existing production load can affect your results. Testing on EC2 (+EBS) is no exception, although you can easily spin up instances for testing, the unpredictable nature of the cloud can also affect your results.
  6. Lastly, although if you are upgrading your hardware alongside your MySQL version, you should also consider running these tests on similar hardware and OS configuration. Sometimes, a slight difference in these factors can skew your results greatly.

Here is a simple report output from pt-upgrade. As you can see, there are no errors or warnings and only differences with query times.

# Query 7: ID 0x76323E2525BA457C at byte 0 _______________________________
# Found 441 differences in 510 samples:
#   checksums       0
#   column counts   0
#   column types    0
#   query times     441
#   row counts      0
#   warning counts  0
#   warning levels  0
#   warnings        0
#            127.0.0.1:50910 127.0.0.1:55170
# Errors                   0               0
# Warnings                 0               0
# Query_time
#   sum                442ms           506ms
#   min                391us           480us
#   max                  4ms             2ms
#   avg                867us           992us
#   pct_95               1ms             1ms
#   stddev             234us           210us
#   median             881us             1ms
# row_count
#   sum                  510             510
#   min                    1               1
#   max                    1               1
#   avg                    1               1
#   pct_95                 1               1
#   stddev                 0               0
#   median                 1               1
# Fingerprint
#   select * from categories where parent = ? and site = ?

Looking at the sample above, there are microsecond differences between the query times. The most important is the average (avg) which tells in general how each fingerprint performs on the test servers. About 10% variation is tolerable and can be accounted for other MySQL overhead like network latency.

pt-upgrade provides valuable insight on how your production queries will behave on your target version, but it is only a part of the process. There is still no alternative to doing your research for incompatibilities, changes and bug fixes or new bugs introduced that can potentially break your application.

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/29/how-to-test-your-upgrades-pt-upgrade/

Thought this was cool: Actively monitoring replication connectivity with MySQL’s heartbeat

Until MySQL 5.5 the only variable used to identify a network connectivity problem between Master and Slave was slave-net-timeout. This variable specifies the number of seconds to wait for more Binary Logs events from the master before abort the connection and establish it again. With a default value of 3600 this has been a historically bad configured variable and stalled connections or high latency peaks were not detected for a long period of time or not detected at all. We needed an active master/slave connection check. And here is where replication’s heartbeat can help us.

This feature was introduce in 5.5 as another parameter to the CHANGE MASTER TO command. After you enable it, the MASTER starts to send “beat” packages (of 106 bytes) to the SLAVE every X seconds where X is a value you can define. If the network link goes down or the latency goes up for more than the time threshold, then the SLAVE IO thread will disconnect and try to connect again. This means we now measure the connection time or latency, not the time without binary log events. We’re actively checking the communication.

How can I configure replication’s heartbeat?

Is very easy to setup with negligible overhead:

mysql_slave > STOP SLAVE;
mysql_slave > CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD=1;
mysql_slave > START SLAVE;

MASTER_HEATBEAT_PERIOD is a value in seconds in the range between 0 to 4294967 with resolution in milliseconds. After the loss of a beat the SLAVE IO Thread will disconnect and try to connect again. Here is the SHOW SLAVE STATUS output after an error:

mysql_slave > show slave status\G
[...]
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
[...]
Last_IO_Errno: 2003
Last_IO_Error: error reconnecting to master 'rsandbox@127.0.0.1:19972' - retry-time: 60 retries: 86400
[...]

Is interesting to note that having a 5.5 slave with replication’s heartbeat enabled and connected to a 5.1 master doesn’t break the replication. Of course, the heartbeat will not work in this case because the master doesn’t know what is a beat or how to send it :)

What status variables do I have?

The heartbeat check period time and the number of beats received.

mysql_slave > SHOW STATUS LIKE '%heartbeat%';
+---------------------------+-------+
| Variable_name | Value |
+---------------------------+-------+
| Slave_heartbeat_period | 1.000 |
| Slave_received_heartbeats | 1476 |
+---------------------------+-------+

Conclusion

If you need to know when exactly the connection between your Master/Slaves breaks then replication’s heartbeat is the easiest and fastest solution to implement.

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/29/actively-monitoring-replication-connectivity-with-mysqls-heartbeat/

Thought this was cool: Identifying the load with the help of pt-query-digest and Percona Server

Overview

Profiling, analyzing and then fixing queries is likely the most oft-repeated part of a job of a DBA and one that keeps evolving, as new features are added to the application new queries pop up that need to be analyzed and fixed. And there are not too many tools out there that can make your life easy. However, there is one such tool, pt-query-digest (from Percona Toolkit) which provides you with all the data points you need to attack the right query in the right way. But vanilla MySQL does have its limitations, it reports only a subset of stats, however if you compare that to Percona server, it reports extra stats such as information about the queries’ execution plan (which includes things like whether Query cache was used or not, if Filesort was used, whether tmp table was created in memory or on disk, if full scan was done, etc) as well as InnoDB statistics (such as IO read operations, the number of unique pages the query accessed, the length of time query waited for row locks, etc). So you can see there is a plethora of useful information reported by Percona Server. Another great thing about Percona Server is the ability to enable logging atomically, not just for new connections as in MySQL. This is very helpful for measurement as otherwise we might not catch some long running connections.

Now let’s get started.

Before We Start!

But before we start, make sure you have enabled slow query logging and set a low enough value for long_query_time. We normally use a value of long_query_time=0, because if you set it to some other value say 0.1 seconds, it will miss all queries shorter than that which well may be majority of your workload. So if you want to analyze what causes load on your server you better use a value of 0. But remember that you would only want to set it to 0, for a period of time that allows you to gather enough statistics about the queries, after that time period remember to set it back to a value greater than 0, because otherwise you can have a really large log file generated. Another thing that we normally do is set the variable log_slow_verbosity to ‘full’, this variable is available in Percona Server and allows us to log all the extra stats I mentioned above in the overview section.

So say if you were using the vanilla MySQL server, you would see an entry like this in the slow query log:

# Time: 111229  3:02:26
# User@Host: msandbox[msandbox] @ localhost []
# Query_time: 2.365434  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 655360
use test;
SET timestamp=1325145746;
select count(*) from auto_inc;

Compare that to Percona Server with log_slow_verbosity=full:

# Time: 111229  3:11:26
# User@Host: msandbox[msandbox] @ localhost []
# Thread_id: 1  Schema: test  Last_errno: 0  Killed: 0
# Query_time: 0.117904  Lock_time: 0.002886  Rows_sent: 1  Rows_examined: 655360  Rows_affected: 0  Rows_read: 655361
# Bytes_sent: 68  Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
# InnoDB_trx_id: F00
# QC_Hit: No  Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 984  InnoDB_IO_r_bytes: 16121856  InnoDB_IO_r_wait: 0.001414
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 973
SET timestamp=1325146286;
select count(*) from auto_inc;

Note that logging all queries in this fashion as opposed to the general query log, enables us to have the statistics available after the query is actually executed, while no such statistics are available for queries that are logged using the general query log.

Installing pt-query-digest tool (as well as other tools from Percona Toolkit) is very easy, and is explained here at this link.

Now before we move forward, I would like to point out that the results shown in this blog post are from the queries that I have gathered from one of the Percona Server instance running on my personal Amazon micro instance.

Using pt-query-digest

Using pt-query-digest is pretty straight forward:

pt-query-digest /path/to/slow-query.log

Note that executing pt-query-digest can be pretty CPU and memory consuming, so ideally you would want to download the “slow query log” to another machine and run it there.

Analyzing pt-query-digest Output

Now let’s see what output it returns. The first part of the output is an overall summary:

# 8.1s user time, 60ms system time, 26.23M rss, 62.49M vsz
# Current date: Thu Dec 29 07:09:32 2011
# Hostname: somehost.net
# Files: slow-query.log.1
# Overall: 20.08k total, 167 unique, 16.04 QPS, 0.01x concurrency ________
# Time range: 2011-12-28 18:42:47 to 19:03:39
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time             8s     1us    44ms   403us   541us     2ms    98us
# Lock time          968ms       0    11ms    48us   119us   134us    36us
# Rows sent        105.76k       0    1000    5.39    9.83   32.69       0
# Rows examine     539.46k       0  15.65k   27.52   34.95  312.56       0
# Rows affecte       1.34k       0      65    0.07       0    1.35       0
# Rows read        105.76k       0    1000    5.39    9.83   32.69       0
# Bytes sent        46.63M      11 191.38k   2.38k   6.63k  11.24k  202.40
# Merge passes           0       0       0       0       0       0       0
# Tmp tables         1.37k       0      61    0.07       0    0.91       0
# Tmp disk tbl         490       0      10    0.02       0    0.20       0
# Tmp tbl size      72.52M       0 496.09k   3.70k       0  34.01k       0
# Query size         3.50M      13   2.00k  182.86  346.17  154.34   84.10
# InnoDB:
# IO r bytes        96.00k       0  32.00k   20.86       0  816.04       0
# IO r ops               6       0       2    0.00       0    0.05       0
# IO r wait           64ms       0    26ms    13us       0   530us       0
# pages distin      28.96k       0      48    6.29   38.53   10.74    1.96
# queue wait             0       0       0       0       0       0       0
# rec lock wai           0       0       0       0       0       0       0
# Boolean:
# Filesort       4% yes,  95% no
# Filesort on    0% yes,  99% no
# Full scan      4% yes,  95% no
# QC Hit         0% yes,  99% no
# Tmp table      4% yes,  95% no
# Tmp table on   2% yes,  97% no

It tells you that a total of 20.08k queries were captured which are actually invocations of 167 different queries. Following that there are summaries of various data points such as the total query execution time and the average query execution time, the number of tmp tables created in memory vs on-disk, percentage of queries that needed full scan, InnoDB IO stats, etc. One thing I suggest here is that, you should probably give more importance to the times/values reported in the 95% (95th percentile) column as that gives us more accurate understanding. Now, for example it is shown here that every query is reading approximately 38.53 distinct InnoDB pages (meaning 616.48K of data), however, 95% of the times InnoDB r ops is 0, which means it accesses these pages in memory. What it says though is that, if this query would run on a cold MySQL instance, then it would require reading nearly 40 pages from disk.

Let’s analyze next part of the output produced by pt-query-digest.

# Profile
# Rank Query ID           Response time Calls R/Call Apdx V/M   Item
# ==== ================== ============= ===== ====== ==== ===== ==========
#    1 0x92F3B1B361FB0E5B  4.0522 50.0%   312 0.0130 1.00  0.00 SELECT wp_options
#    2 0xE71D28F50D128F0F  0.8312 10.3%  6412 0.0001 1.00  0.00 SELECT poller_output poller_item
#    3 0x211901BF2E1C351E  0.6811  8.4%  6416 0.0001 1.00  0.00 SELECT poller_time
#    4 0xA766EE8F7AB39063  0.2805  3.5%   149 0.0019 1.00  0.00 SELECT wp_terms wp_term_taxonomy wp_term_relationships
#    5 0xA3EEB63EFBA42E9B  0.1999  2.5%    51 0.0039 1.00  0.00 SELECT UNION wp_pp_daily_summary wp_pp_hourly_summary wp_pp_hits wp_posts
#    6 0x94350EA2AB8AAC34  0.1956  2.4%    89 0.0022 1.00  0.01 UPDATE wp_options
#    7 0x7AEDF19FDD3A33F1  0.1381  1.7%   909 0.0002 1.00  0.00 SELECT wp_options
#    8 0x4C16888631FD8EDB  0.1160  1.4%     5 0.0232 1.00  0.00 SELECT film
#    9 0xCFC0642B5BBD9AC7  0.0987  1.2%    50 0.0020 1.00  0.01 SELECT UNION wp_pp_daily_summary wp_pp_hourly_summary wp_pp_hits
#   10 0x88BA308B9C0EB583  0.0905  1.1%     4 0.0226 1.00  0.01 SELECT poller_item
#   11 0xD0A520C9DB2D6AC7  0.0850  1.0%   125 0.0007 1.00  0.00 SELECT wp_links wp_term_relationships wp_term_taxonomy
#   12 0x30DA85C940E0D491  0.0835  1.0%   542 0.0002 1.00  0.00 SELECT wp_posts
#   13 0x8A52FE35D340A347  0.0767  0.9%     4 0.0192 1.00  0.00 TRUNCATE TABLE poller_time
#   14 0x3E84BF7C0C2A3005  0.0624  0.8%   272 0.0002 1.00  0.00 SELECT wp_postmeta
#   15 0xA01053DA94ED829E  0.0567  0.7%   213 0.0003 1.00  0.00 SELECT data_template_rrd data_input_fields
#   16 0xBE797E1DD5E4222F  0.0524  0.6%    79 0.0007 1.00  0.00 SELECT wp_posts
#   17 0xF8EC4434E0061E89  0.0475  0.6%    62 0.0008 1.00  0.00 SELECT wp_terms wp_term_taxonomy
#   18 0xCDFFAD848B0C1D52  0.0465  0.6%     9 0.0052 1.00  0.01 SELECT wp_posts wp_term_relationships
#   19 0x5DE709416871BF99  0.0454  0.6%   260 0.0002 1.00  0.00 DELETE poller_output
#   20 0x428A588445FE580B  0.0449  0.6%   260 0.0002 1.00  0.00 INSERT poller_output
# MISC 0xMISC              0.8137 10.0%  3853 0.0002   NS   0.0 <147 ITEMS>

The above part of the output ranks the queries and shows the top queries with largest impact – longest sum of run time which typically (not always) shows queries causing highest load on the server. As we can see here the one causing the highest load is the SELECT wp_options query, this is basically a unique way of identifying the query and simply implies that this is a SELECT query executed against the wp_options table. Another thing to note is the last line in the output the # MISC part, it tells you how much of “load” is not covered by top queries, we have 10% in MISC which means that by reviewing these top 20 queries we essentially reviewed most of the load.

Now let’s take a look at the most important part of the output:

# Query 1: 0.26 QPS, 0.00x concurrency, ID 0x92F3B1B361FB0E5B at byte 14081299
# This item is included in the report because it matches --limit.
# Scores: Apdex = 1.00 [1.0], V/M = 0.00
# Query_time sparkline: |   _^   |
# Time range: 2011-12-28 18:42:47 to 19:03:10
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count          1     312
# Exec time     50      4s     5ms    25ms    13ms    20ms     4ms    12ms
# Lock time      3    32ms    43us   163us   103us   131us    19us    98us
# Rows sent     59  62.41k     203     231  204.82  202.40    3.99  202.40
# Rows examine  13  73.63k     238     296  241.67  246.02   10.15  234.30
# Rows affecte   0       0       0       0       0       0       0       0
# Rows read     59  62.41k     203     231  204.82  202.40    3.99  202.40
# Bytes sent    53  24.85M  46.52k  84.36k  81.56k  83.83k   7.31k  79.83k
# Merge passes   0       0       0       0       0       0       0       0
# Tmp tables     0       0       0       0       0       0       0       0
# Tmp disk tbl   0       0       0       0       0       0       0       0
# Tmp tbl size   0       0       0       0       0       0       0       0
# Query size     0  21.63k      71      71      71      71       0      71
# InnoDB:
# IO r bytes     0       0       0       0       0       0       0       0
# IO r ops       0       0       0       0       0       0       0       0
# IO r wait      0       0       0       0       0       0       0       0
# pages distin  40  11.77k      34      44   38.62   38.53    1.87   38.53
# queue wait     0       0       0       0       0       0       0       0
# rec lock wai   0       0       0       0       0       0       0       0
# Boolean:
# Full scan    100% yes,   0% no
# String:
# Databases    wp_blog_one (264/84%), wp_blog_tw… (36/11%)... 1 more
# Hosts
# InnoDB trxID 86B40B (1/0%), 86B430 (1/0%), 86B44A (1/0%)... 309 more
# Last errno   0
# Users        wp_blog_one (264/84%), wp_blog_two (36/11%)... 1 more
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms  #################
#  10ms  ################################################################
# 100ms
#    1s
#  10s+
# Tables
#    SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_options'\G
#    SHOW CREATE TABLE `wp_blog_one `.`wp_options`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G

This is the actual part of the output dealing with analysis of the query that is taking up the longest sum of run time, query ranked #1. The first row in the table above shows the Count of number of times this query was executed. Now let’s take a look at the values in the 95% column, we can see that this query is taking up 20ms 95% of the times and is sending 202 rows and 83.83k of data per query while its also examining 246 rows for every query. Another important thing that is shown here is that every query is reading approximately 38.53 distinct InnoDB pages (meaning 616.48k of data). While you can also see that this query is doing a full scan every time its run. The “Databases” section of the output also shows the name of the databases where this query was executed. Next the “Query_time distribution” section shows how this query times mostly, which you can see majority of the time lies in the range >= 10ms and < 100ms. The “Tables” section lists the queries that you can use to gather more data about the underlying tables involved and the query execution plan used by MySQL.
The end result might be that you end up limiting the number of results returned by the query, by using a LIMIT clause or by filtering based on the option_name column, or you might even compress the values stored in the option_value column so that less data is read and sent.

Let’s analyze another query, this time query ranked #4 by pt-query-digest.

# Query 4: 0.12 QPS, 0.00x concurrency, ID 0xA766EE8F7AB39063 at byte 4001761
# This item is included in the report because it matches --limit.
# Scores: Apdex = 1.00 [1.0], V/M = 0.00
# Query_time sparkline: |  .^_   |
# Time range: 2011-12-28 18:42:47 to 19:02:57
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count          0     149
# Exec time      3   281ms   534us    17ms     2ms     4ms     2ms     1ms
# Lock time      1    14ms    56us   179us    92us   159us    29us    80us
# Rows sent      8   9.01k       0     216   61.90  202.40   59.80   24.84
# Rows examine   8  45.05k       0   1.05k  309.59 1012.63  299.07  124.25
# Rows affecte   0       0       0       0       0       0       0       0
# Rows read      8   9.01k       0     216   61.90  202.40   59.80   24.84
# Bytes sent     1 622.17k     694  13.27k   4.18k  11.91k   3.35k   2.06k
# Merge passes   0       0       0       0       0       0       0       0
# Tmp tables    10     149       1       1       1       1       0       1
# Tmp disk tbl  30     149       1       1       1       1       0       1
# Tmp tbl size  16  12.00M       0 287.72k  82.47k 270.35k  79.86k  33.17k
# Query size     1  45.68k     286     345  313.91  329.68   13.06  313.99
# InnoDB:
# IO r bytes     0       0       0       0       0       0       0       0
# IO r ops       0       0       0       0       0       0       0       0
# IO r wait      0       0       0       0       0       0       0       0
# pages distin   2     683       2       7    4.58    6.98    1.29    4.96
# queue wait     0       0       0       0       0       0       0       0
# rec lock wai   0       0       0       0       0       0       0       0
# Boolean:
# Filesort     100% yes,   0% no
# Tmp table    100% yes,   0% no
# Tmp table on 100% yes,   0% no
# String:
# Databases    wp_blog_one (105/70%), wp_blog_tw... (34/22%)... 1 more
# Hosts
# InnoDB trxID 86B40F (1/0%), 86B429 (1/0%), 86B434 (1/0%)... 146 more
# Last errno   0
# Users        wp_blog_one (105/70%), wp_blog_two (34/22%)... 1 more
# Query_time distribution
#   1us
#  10us
# 100us  ###################
#   1ms  ################################################################
#  10ms  #
# 100ms
#    1s
#  10s+
# Tables
#    SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_terms'\G
#    SHOW CREATE TABLE `wp_blog_one `.`wp_terms`\G
#    SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_term_taxonomy'\G
#    SHOW CREATE TABLE `wp_blog_one `.`wp_term_taxonomy`\G
#    SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_term_relationships'\G
#    SHOW CREATE TABLE `wp_blog_one `.`wp_term_relationships`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT t.*, tt.*, tr.object_id FROM wp_terms AS t INNER JOIN wp_term_taxonomy AS tt ON tt.term_id = t.term_id INNER JOIN wp_term_relationships AS tr ON tr.term_taxonomy_id = tt.term_taxonomy_id WHERE tt.taxonomy IN ('category', 'post_tag', 'post_format') AND tr.object_id IN (733) ORDER BY t.name ASC\G

Let’s again take a look at the 95% column in the above output. The query execution time is 4ms, it sends 202 rows for which it has to examine 1012 rows (per every query), interesting here is that this query needs to do Filesort 100% of the times and also needs to create on-disk tmp tables every time its executed. Those are the two important things that you would probably like to fix with respect to this query. The tmp table size needed per query is 270.35k, which is not much considering the fact that tmp_tbl_size variable is set to 32M on the server, so on-disk tables are probably being created because of blob columns being accessed by the query. So a quick fix here could be to instead of selecting every column from all the tables involved in the query, probably selecting only the needed columns which could exclude the blob ones.

Conclusion

The only conclusion, I can make out is “Get yourself Percona Server, turn on log_slow_verbosity and start using pt-query-digest”, your job of identifying queries producing most load will be all the more simpler then.

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/29/identifying-the-load-with-the-help-of-pt-query-digest-and-percona-server/

Thought this was cool: Solving INFORMATION_SCHEMA slowness

Many of us find INFORMATION_SCHEMA painfully slow to work it when it comes to retrieving table meta data. Many people resort to using file system tools instead to
find for example how much space innodb tables are using and things like it. Besides being just slow accessing information_schema can often impact server performance
dramatically. The cause of majority of this slowness is not opening and closing tables, which can be solved with decent table cache size, and which is very fast for
Innodb but by the fact MySQL by default looks to refresh Innodb statistics each time table is queried from information schema.

The solution is simple, just set innodb_stats_on_metadata=0 which will prevent statistic update when you query information_schema. Most likely
you do not want it anyway. This will not make Innodb to operate without statistics at all as Innodb will still compute statistics for the table first time it opens it.

Here are some numbers from my test box:

mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|      130 |       2856365892 |
+----------+------------------+
1 row in set (1.08 sec)

mysql> set global innodb_stats_on_metadata=0;
Query OK, 0 rows affected (0.00 sec)

mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|      130 |       2856365892 |
+----------+------------------+
1 row in set (0.00 sec)

As you can see performance gains are huge.
Note enabling this option will not make information_schema to be stale when it comes to important stuff – data_length for example will be correctly returned by information schema as it changes.

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/23/solving-information_schema-slowness/

Thought this was cool: Optimizing InnoDB for creating 30,000 tables (and nothing else)

Once upon a time, it would have been considered madness to even attempt to create 30,000 tables in InnoDB. That time is now a memory. We have customers with a lot more tables than a mere 30,000. There have historically been no tests for anything near this many tables in the MySQL test suite.

So, in fleshing out the test cases for this and innodb_dict_size_limit I was left with the not so awesome task of making the test case run in remotely reasonable time. The test case itself is pretty simple, a simple loop in the not at all exciting mysqltest language that will create 30,000 identical tables, insert a row into each of them and then drop them.

Establishing the ground rules: I do not care about durability. This is a test case, not a production system holding important data which means I can lie, cheat and steal to get performance.

The simplest way is to use libeatmydata. This is a small shared library designed to be LD_PRELOADed that disables just about every way an application can write data safely to disk. This is perfect for running a test suite for a database server as if your machine crashes halfway through a test run, you’ll just start the test run again – you are not dealing with any important data. Obviously, you should never, ever, ever use libeatmydata with anything you care about; it is called libeat-my-data for a reason.

Without libeatmydata and using the default options for the MySQL test suite, I noticed that I was only creating about 10-15 tables every second which means we’d take a very long time to create 30,000. After a bit of time, it sped up to about 20-25 per second. Of course, with the default timeout for a MySQL test (15 minutes), we quickly (well.. in 15 minutes) hit that and the test is failed for taking too long.

With libeatmydata the test takes about 77 seconds – a huge improvement.

So how could I possibly get this test to run (even with –big-test option to mysql-test-run) in reasonable time? Well… I can set some InnoDB options! I’m going to try the obvious first: innodb-flush-method, sync-frm and innodb-flush-log-at-trx-commit. There is an undocumented option for innodb-flush-method called “nosync” that is meant to not flush data to disk. Since you could hear how much syncing to disk was going on during my test run, not syncing to disk all the time would get closer to the libeatmydata performance. I also want to disable syncing of the FRM file to disk and set log flushing to happen as infrequently as possible. With these options I started to get somewhere between 25-90 CREATE TABLE per second. This gets the test execution time down to 12 minutes, so that just escapes the timeout.

I then added the options of innodb-adaptive-checkpoint=0 and flush-neighbor-pages=0 in the hope of avoiding a bunch of background flushing (which called fsync). It didn’t help.

I noticed that there was an fsync() call when extending the data file, so I tried setting a higher innodb-autoextend-increment and a larger initial size. This also did not help.

So how fast is InnoDB under all of this? Am I hitting a fundamental limitation in InnoDB?

Well…. I went and wrote a program using HailDB – which is InnoDB as a shared library that you can call using an easy to use C API.

Writing a simple test program that creates 30,000 tables in a similar InnoDB configuration as default MySQL is pretty easy (easier than writing the mysqltest language that’s for sure). After a “I’m glad this isn’t a SSD” killer amount of fsync() activity, it took a total of 14.5 minutes. Not too bad. This is less than my initial test with MySQL, probably due to not writing and syncing FRM files. If I run the same program with libeatmydata, it only takes 15-20 seconds. Clearly it’s syncing things to disk that takes all the time.

If we make the HailDB program set flush_method to nosync and flush_log_at_trx_commit=2, the execution time is only 1 minute. This is much closer to the libeatmydata time than MySQL ever got.

With HailDB you can do more than one data dictionary operation in a single transaction. So if instead of setting flush_method and flush_log_at_trx_commit I instead group the CREATE TABLE into transactions of creating 100 tables at a time, I get an execution time of 3 minutes. This is a big difference to the original 14.5 minutes.

What’s the practical applications of all of this? Not much (unless you’re writing complex test cases) but it is clear that loading large amounts of DDL could be a lot faster than it is currently (although loading the data into tables is still going to take a while too).

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/22/optimizing-innodb-for-creating-30000-tables-and-nothing-else/

Thought this was cool: Percona Toolkit PDF manual now available

Not to be outdone by Vadim’s announcement, I also have one: Percona Toolkit’s manual is now available as PDF (requires registration). You can get it here: http://www.percona.com/software/percona-toolkit/

Thanks to Mauricio Stekl, who did all the hard work to create the PDF of the manual — and did it fast, too.

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/20/percona-toolkit-pdf-manual-now-available/

Thought this was cool: Percona Server documentation in PDF is available

This is to announce that we released the PDF version of Percona Server documentation (5.1 and 5.5 releases),
you can download it from our web site http://www.percona.com/software/percona-server/ (it will require registration).

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/20/percona-server-documentation-in-pdf-is-available/

Thought this was cool: Percona Server 5.5.18-23.0: Now with Group Commit!

Percona is glad to announce the release of Percona Server 5.5.18-23.0 on December 19th, 2011 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.5.18, including all the bug fixes in it, Percona Server 5.5.18-23.0 is now the current stable release in the 5.5 series. All of Percona ‘s software is open-source and free, all the details of the release can be found in the 5.5.18-23.0 milestone at Launchpad.

New Features

Percona Server now supports group commit between XtraDB and the replication binlog. Percona has imported the group commit patch from MariaDB and is making the performance improvements that group commit brings available to users of Percona Server 5.5. See the Testing the Group Commit Fix blog post for the kind of performance improments that can be expected.

Bug Fixes

  • Several crashes were reported when using the --query-cache-strip-comments feature ofPercona Server. We have fixed several causes for crashes, especially around the handling of escaped characters. Bugs fixed: #856404#705688 (Oleg Tsarev)
  • The Expand Table Import was improved not to hold the InnoDB data dictionary mutex for the full duration of the import operation. This allows queries accessing other InnoDB tables to proceed normally and not be blocked until the import completes. Bug fixed: #901775 (Alexey Kopytov)
  • As a follow-up to the already-fixed #803865, further fixes were made to the implementation of atomic operations which is used on 32-bit systems when compiled without i686+ support. There were no observed issues with the previous implementation, the fixes were made proactively for benign issues. Additionally, the Response Time Distribution, which uses those operations, was made slightly more efficient. Bug fixed: #878022 (Laurynas Biveinis)
  • An output buffer truncation check in Response Time Distribution was fixed. Bug fixed: #810272(Laurynas Biveinis)
  • The compilation warnings, produced by GCC versions up to and including 4.6, were audited and fixed. Bug fixed: #878164 (Laurynas Biveinis)
  • Testsuite stability fix for the percona_status_wait_query_cache_mutex test. Bug fixed: #878709(Oleg Tsarev)
  • A missing link was added to the Percona Server upgrade documentation. Bug fixed: #885633(Alexey Kopytov)

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/19/percona-server-5-5-18-23-0-now-with-group-commit/

Thought this was cool: Statement based replication with Stored Functions, Triggers and Events

Statement based replication writes the queries that modify data in the Binary Log to replicate them on the slave or to use it as a PITR recovery. Here we will see what is the behavior of the MySQL when it needs to log “not usual” queries like Events, Functions, Stored Procedures, Local Variables, etc. We’ll learn what problems can we have and how to avoid them.

TRIGGERS

When a statement activates a Trigger only the original query is logged not the subsequent triggered statements. If you want to maintain the consistency of your data is necessary to define the same Triggers in Master and Slave servers.

Example:

mysql> create trigger Copy_data AFTER INSERT on t FOR EACH ROW INSERT INTO t_copy VALUE(NEW.i);
mysql> insert into t VALUES(1),(2),(3);

Binary Log:

#111213 23:16:21 server id 1 end_log_pos 269 Query thread_id=3 exec_time=0 error_code=0
use test/*!*/;
SET TIMESTAMP=1323814581/*!*/;
insert into t VALUES(1),(2),(3)

This behavior can be a problem to take in account or in some other cases help us in our infrastructure. Is possible to define different Triggers in your replication servers if you need different actions for the same statements.

FUNCTIONS

Calls to functions are logged directly on the Binary Log. Therefore if you don’t have all functions created on all your servers you will break your replication and the SQL process will stop with an error.

Example:

mysql> CREATE FUNCTION this_year() RETURNS INT DETERMINISTIC RETURN YEAR(CURDATE());
mysql> insert into t VALUES(this_year());

Binary Log:

#111213 23:25:46 server id 1 end_log_pos 676 Query thread_id=5 exec_time=0 error_code=0
SET TIMESTAMP=1323815146/*!*/;
insert into t VALUES(this_year())

If you forget to create the function on your slave the replication will be broken. After executing a SHOW SLAVE STATUS you will see a message like the following one:

Last_Error: Error 'FUNCTION test.this_year does not exist' on query. Default database: 'test'. Query: 'insert into t VALUES(this_year())'

STORED PROCEDURES

The behaviour of stored procedures and functions are completely different. If our stored procedure write data to our tables the queries inside the procedure get logged and not the call to the procedure itself. So in this case you don’t need to replicate all your stored procedures in your slaves servers.

Example:

mysql> create procedure this_pyear() BEGIN INSERT INTO t VALUES(YEAR(CURDATE())); END;//
mysql> CALL this_pyear();

Binary Log:

#111213 23:33:41 server id 1 end_log_pos 2055 Query thread_id=6 exec_time=0 error_code=0
SET TIMESTAMP=1323815621/*!*/;
INSERT INTO t VALUES(YEAR(CURDATE()))

We’re going to do it a little bit more difficult:

Our procedure modifies a table that has a associated ON INSERT trigger and we use the previous function to insert the present year. Following the previous rules that we explained before is easy to imagine. The query inside the procedure gets logged with the explicit call to the Function but not the statement that the trigger executed.

We have a trigger that calls a the procedure. In this case neither the trigger, the CALL or the queries inside the procedure that modifies data are logged. So in this case you need to manually create the trigger and procedure on the slave servers.

EVENTS

When we create an Event on the Master server it gets replicated to the slave with the DISABLE ON SLAVE option. Thanks to that the Event is not executed N times for every slave we have in our infrastructure and we won’t duplicate data. The statements inside the Event are logged and not the Event itself.

Example:

mysql> create event year ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 1 HOUR DO INSERT INTO t VALUES(YEAR(CURDATE()));

Binary Logs:

First the event gets logged to the Binary Log:

CREATE DEFINER=`msandbox`@`localhost` event year ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 1 HOUR DO INSERT INTO t VALUES(YEAR(CURDATE()))

After one hour the event gets executed. The insert is logged and then re-executed on the slave server so is not necessary to have the events enabled on all servers:

#111214 0:51:54 server id 1 end_log_pos 1593 Query thread_id=18 exec_time=0 error_code=0
SET TIMESTAMP=1323820314/*!*/;
INSERT INTO t VALUES(YEAR(CURDATE()))

If later we need to promote a Slave to Master we will need to do some steps to enable all the events replicated from the master because as we saw they’re disabled by default. Here we can see the previous event after it gets replicated on the slave:

CREATE DEFINER=`msandbox`@`localhost` EVENT `year` ON SCHEDULE AT '2011-12-14 22:03:06' ON COMPLETION NOT PRESERVE DISABLE ON SLAVE DO INSERT INTO t VALUES(YEAR(CURDATE()));

These are the steps we need to do to enable on the promoted slave.

  • We should disable the event manager on Slave with SET GLOBAL event_scheduler = OFF;
  • ENABLE every event with “ALTER EVENT event_name ENABLE”
  • Enable again the event_scheduler.

To demote a server we need to follow the same previous steps but in this case we alter every event adding DISABLE ON SLAVE. ALTER EVENT even_name DISABLE ON SLAVE.

LOCAL VARIABLES

Local Variables are logged as an extra events before the statement itself. So all used variables are replicated on the slave.

Example:

mysql> SELECT YEAR(CURDATE()) INTO @this_year;
mysql> insert into t VALUES(@this_year)

Binary Log:

#111213 23:58:11 server id 1 end_log_pos 457 User_var
SET @`this_year`:=2011/*!*/;
# at 457
#111213 23:58:11 server id 1 end_log_pos 552 Query thread_id=13 exec_time=0 error_code=0
SET TIMESTAMP=1323817091/*!*/;
insert into t VALUES(@this_year)

Is important to add a note here. There is a bug related with this described on https://bugs.launchpad.net/percona-server/+bug/860910. In a master-master setup a master can show a wrong ‘SHOW SLAVE STATUS’ output when using SET user-variables and then using it to perform writes. The issue is fixed only in Percona server 5.5.17-22.1.

AUTO INCREMENTAL VALUES

In order to have the same auto incremental values on master and slaves the actual used auto incremental value is logged as an extra event just before the statement. MySQL uses the INSERT_ID variable to store that value.

Binary Log:

#111214 0:03:04 server id 1 end_log_pos 811 Intvar
SET INSERT_ID=1/*!*/;
# at 811
#111214 0:03:04 server id 1 end_log_pos 905 Query thread_id=14 exec_time=0 error_code=0
SET TIMESTAMP=1323817384/*!*/;
INSERT INTO y (value) VALUES(1)

If we use “LAST_INSERT_ID” on our master the value of that function is logged as a variable on the binary log so the slave can use the same value.

Binary Log:

#111214 0:04:50 server id 1 end_log_pos 1245 Intvar
SET LAST_INSERT_ID=2/*!*/;
#111214 0:04:50 server id 1 end_log_pos 1273 Intvar
SET INSERT_ID=3/*!*/;
#111214 0:04:50 server id 1 end_log_pos 1382 Query thread_id=15 exec_time=0 error_code=0
SET TIMESTAMP=1323817490/*!*/;
INSERT INTO y (value) VALUES(LAST_INSERT_ID())

That was for a insert with only one value. What is the behavior if we insert multiple values and we have gaps?

Example:
Imagine the table has this values on the auto incremental column: 1,2,3,5

Then we execute the following:

insert into y (i,value) VALUES(4,1),(NULL,1),(NULL,1);

Binary Log:

The value 4 is inserted with the INSERT statement and the INSERT_ID is the next auto incremental value:

#111214 21:54:06 server id 1 end_log_pos 203 Intvar
SET INSERT_ID=6/*!*/;
#111214 21:54:06 server id 1 end_log_pos 319 Query thread_id=12 exec_time=0
SET TIMESTAMP=1323896046/*!*/;
insert into y (i,value) VALUES(4,1),(NULL,1),(NULL,1)

Conclusion

As we can see there are a lot of things to take in account when we are working with statement based replication. Knowing in advances how MySQL works internally can help us to improve our replication availability and data.

from MySQL Performance Blog: http://www.mysqlperformanceblog.com/2011/12/16/statement-based-replication-with-stored-functions-triggers-and-events/