256 GB RAM, 64 cores, AMD, Ubuntu 12.04 with Percona MySQL 5.5.28.
Below is the assertion failure. We just had a second assertion failure (different "in file", position, etc.) while running a large batch of inserts.
After the first failure, MySQL only came back up after a reboot; before that it kept looping on the same error each time it tried to recover.
I decided to run mysqlcheck with -o to optimize. Since these are all InnoDB tables (very large tables, 60+ GB), this performs an ALTER TABLE on every table.
In the middle of that, the assertion failure below occurred again:
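A minimal sketch of how the rebuild could be narrowed down to a single table: for InnoDB, OPTIMIZE TABLE is carried out as a table rebuild (the backtrace in the log goes through mysql_recreate_table and mysql_alter_table), so the same work can be issued one table at a time to see which table triggers the assertion. The helper names and the schema and table names here are hypothetical placeholders, not taken from my setup.

```python
def quote_ident(name):
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def rebuild_statements(schema, tables):
    """Return one ALTER TABLE ... ENGINE=InnoDB statement per table.

    Running these one at a time (instead of mysqlcheck -o over everything)
    makes it easier to tell which table the crash belongs to.
    """
    return [
        "ALTER TABLE {}.{} ENGINE=InnoDB;".format(quote_ident(schema), quote_ident(t))
        for t in tables
    ]


if __name__ == "__main__":
    # "mydb", "orders" and "events" are placeholder names for illustration only.
    for stmt in rebuild_statements("mydb", ["orders", "events"]):
        print(stmt)
```

The script only prints the statements; they would still have to be fed to the mysql client by hand, ideally with the error log open in another terminal.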
121115 22:30:31 InnoDB: Assertion failure in thread 140086589445888 in file btr0pcur.c line 452
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == buf_block_get_page_no(btr_pcur_get_block(cursor))
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
03:30:31 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/
key_buffer_size=536870912
read_buffer_size=131072
max_used_connections=404
max_threads=500
thread_count=90
connection_count=90
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1618416 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x14edeb710
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f687366ce80 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7b52ee]
/usr/sbin/mysqld(handle_fatal_signal+0x484)[0x68f024]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f9cbb23fcb0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7f9cbaea6425]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7f9cbaea9b8b]
/usr/sbin/mysqld[0x858463]
/usr/sbin/mysqld[0x804513]
/usr/sbin/mysqld[0x808432]
/usr/sbin/mysqld[0x7db8bf]
/usr/sbin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x1d)[0x755aed]
/usr/sbin/mysqld(_Z17mysql_alter_tableP3THDPcS1_P24st_ha_create_informationP10TABLE_LISTP10Alter_infojP8st_orderb+0x216b)[0x60399b]
/usr/sbin/mysqld(_Z20mysql_recreate_tableP3THDP10TABLE_LIST+0x166)[0x604bd6]
/usr/sbin/mysqld[0x647da1]
/usr/sbin/mysqld(_ZN24Optimize_table_statement7executeEP3THD+0xde)[0x64891e]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1168)[0x59b558]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x30c)[0x5a132c]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1620)[0x5a2a00]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x14f)[0x63ce6f]
/usr/sbin/mysqld(handle_one_connection+0x51)[0x63cf31]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f9cbb237e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f9cbaf63cbd]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f6300004b60): is an invalid pointer
Connection ID (thread ID): 876
Status: NOT_KILLED
You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
121115 22:31:07 [Note] Plugin 'FEDERATED' is disabled.
121115 22:31:07 InnoDB: The InnoDB memory heap is disabled
121115 22:31:07 InnoDB: Mutexes and rw_locks use GCC atomic builtins
.. Then it recovered, without a reboot this time. Based on the log, what could be the cause?
I am currently running a dump to see whether the problem resurfaces. Update: mysqldump completed fine. I am now trying to restore the entire database from the dump.
Edit: the data partition is entirely on / since this is a hosted filesystem and that is the default, unfortunately:
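Since the two failures reported a different "in file" and position, it may help to pull every assertion header out of the error log and compare them side by side. This is a rough sketch, assuming the header always has the shape shown in the log above; the regular expression and the function name find_assertions are my own and not part of any MySQL tooling.

```python
import re

# Matches InnoDB assertion headers such as:
# 121115 22:30:31 InnoDB: Assertion failure in thread 140086589445888 in file btr0pcur.c line 452
ASSERTION_RE = re.compile(
    r"^(?P<date>\d{6})\s+(?P<time>\d{1,2}:\d{2}:\d{2})\s+InnoDB: Assertion failure "
    r"in thread (?P<thread>\d+) in file (?P<file>\S+) line (?P<line>\d+)"
)


def find_assertions(lines):
    """Yield (timestamp, source_file, line_number) for each assertion header."""
    for text in lines:
        m = ASSERTION_RE.match(text.strip())
        if m:
            timestamp = m.group("date") + " " + m.group("time")
            yield (timestamp, m.group("file"), int(m.group("line")))


if __name__ == "__main__":
    # Two lines copied from the log above; only the first is an assertion header.
    sample = [
        "121115 22:30:31 InnoDB: Assertion failure in thread 140086589445888 in file btr0pcur.c line 452",
        "InnoDB: We intentionally generate a memory trap.",
    ]
    for hit in find_assertions(sample):
        print(hit)
```

To run it against the real log, the open file object for /var/log/mysql/error.log (the log_error path from my.cnf) can be passed in place of the sample list.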
Filesystem Size Used Avail Use% Mounted on
/dev/vda3 742G 445G 260G 64% /
udev 121G 4.0K 121G 1% /dev
tmpfs 49G 248K 49G 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 121G 0 121G 0% /run/shm
/dev/vda1 99M 54M 40M 58% /boot
my.cnf:
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
skip-name-resolve
innodb_file_per_table
default_storage_engine=InnoDB
user = mysql
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /data/mysql
tmpdir = /tmp
skip-external-locking
key_buffer = 512M
max_allowed_packet = 128M
thread_stack = 192K
thread_cache_size = 64
myisam-recover = BACKUP
max_connections = 500
table_cache = 812
table_definition_cache = 812
#query_cache_limit = 4M
#query_cache_size = 512M
join_buffer_size = 512K
innodb_additional_mem_pool_size = 20M
innodb_buffer_pool_size = 196G
#innodb_file_io_threads = 4
#innodb_thread_concurrency = 12
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 8M
innodb_log_file_size = 1024M
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 120
log_error = /var/log/mysql/error.log
long_query_time = 5
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slowlog.log
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
[isamchk]
key_buffer = 16M
EDIT: the mysqldump ran without problems. Why do these crashes happen on an ALTER TABLE?