GTID有哪些独特的复制姿势

##==========================================##

本文为DBA+社会群众体育的投稿作品:


##==========================================##

Master的crash safe

前边一直在讲crash safe slave,Master的crash safe相符重要。
要想Master保持crash
safe须求按上面包车型的士参数实行安装,不然不但会屏弃事务,gtid_executed还大概和事实上的innodb存款和储蓄引擎中的数据不均等。

sync_binlog                    = 1innodb_flush_log_at_trx_commit = 1 

在Master配置为”双1″的情景下,Master
crash后,若无发生failover,能够世袭作为Master。
要是产生了failover,可以检查旧Master和新Master上由旧Master施行的事情群集是不是意气风发致。

 show master status 

倘诺相像,能够按MASTE奥迪Q3_AUTO_POSITION =
1的情势将旧Master作为Slave和新Master创建复制关系。否则,考虑做政工补偿或从新Master上拉取备份进行苏醒。

在Master配置不是”双1″的景观下,在Master
crash后由于不便正确明白旧Master上到底实施了什么样专门的学业,安全的做法是施行主备切换,并从新Master上拉取备份,把旧Master作为新Master的Slave进行理并答复原。

本文为DBA+社会群众体育的投稿文章:
与MySQL古板复制相比较,GTID有如何…

利用GTID情势切换的日记:

何以发生GTID

GTID的改变受gtid_next控制。
在Master上,gtid_next是暗许的AUTOMATIC,即在每一回事务提交时自动生成新的GTID。它今后时此刻已施行的GTID会集(即gtid_executed)中,找二个大于0的未利用的最小值作为下个业务GTID。同有时候在binlog的莫过于的翻新工作事件前面插入一条set
gtid_next事件。

以下是一条insert语句生成的binlog记录

mysql> use `test`Database changedmysql> insert into tbx1 values(1);Query OK, 1 row affected (0.01 sec)mysql> show binlog events IN 'binlog.000015';+---------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+| Log_name      | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |+---------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+...| binlog.000015 | 707 | Gtid           |         1 |         755 | SET @@SESSION.GTID_NEXT= 'e10c75be-5c1b-11e6-ab7c-000c296078ae:9' || binlog.000015 | 755 | Query          |         1 |         834 | BEGIN                                                             || binlog.000015 | 834 | Query          |         1 |         934 | use `test`; insert into tbx1 values(1)                            || binlog.000015 | 934 | Xid            |         1 |         965 | COMMIT /* xid=20 */                                               | 

在Slave上回看主库的binlog时,先进行set gtid_next
…,然后再施行真正的insert语句,确认保证在主和备上那条insert对应于相近的GTID。

诚如意况下,GTID会集是连接的,但使用三十六线程复制(MTS卡塔尔以至因此gtid_next实行人工干预时会招致gtid空洞。比如下边那样:

mysql> show master status;+---------------+----------+--------------+------------------+------------------------------------------+| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |+---------------+----------+--------------+------------------+------------------------------------------+| binlog.000015 |      965 |              |                  | e10c75be-5c1b-11e6-ab7c-000c296078ae:1-9 |+---------------+----------+--------------+------------------+------------------------------------------+1 row in set (0.00 sec)mysql> set gtid_next='e10c75be-5c1b-11e6-ab7c-000c296078ae:12';Query OK, 0 rows affected (0.00 sec)mysql> begin;Query OK, 0 rows affected (0.00 sec)mysql> commit;Query OK, 0 rows affected (0.00 sec)mysql> set gtid_next='AUTOMATIC';Query OK, 0 rows affected (0.00 sec)mysql> show master status;+---------------+----------+--------------+------------------+---------------------------------------------+| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                           |+---------------+----------+--------------+------------------+---------------------------------------------+| binlog.000015 |     1158 |              |                  | e10c75be-5c1b-11e6-ab7c-000c296078ae:1-9:12 |+---------------+----------+--------------+------------------+---------------------------------------------+1 row in set (0.00 sec) 

继续试行事务,MySQL会分配八个纤维的未选拔GTID,也便是从现身空洞之处分配GTID,最后会把空洞填上。

mysql> insert into tbx1 values(1);Query OK, 1 row affected (0.01 sec)mysql> show master status;+---------------+----------+--------------+------------------+----------------------------------------------+| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                            |+---------------+----------+--------------+------------------+----------------------------------------------+| binlog.000015 |     1416 |              |                  | e10c75be-5c1b-11e6-ab7c-000c296078ae:1-10:12 |+---------------+----------+--------------+------------------+----------------------------------------------+1 row in set (0.00 sec) 

这表示严俊来讲大家即无法借使GTID会集是连连的,也无法假定GTID序号大的专门的学问在GTID序号小的职业之后试行,事务的次第应由业务记录在binlog中的前后相继顺序决定。

##==========================================##
在非GTID情势下,会先举行Phase
3.1等第,从有着最新BINLOG的从库上获得差别日志,再拓宽Phase
3.2阶段,尝试从原Master服务器上拿到最新BINLOG。
 www.350.vip 1

与MySQL古板复制比较,GTID有怎么着特殊的复制姿势?

有上诉近似难题时,将从库进步为主库并使用master_auto_position=1来安顿复制,复制会因为新主库不恐怕提供丰硕BINLOG事件而停业。

哪些查看GTID

能够经过MySQL的多少个变量查占星关的GTID新闻。

  • gtid_executed
    在脚下实例上推行过的GTID集结;
    实际上满含了有着记录到binlog中的事务。所以,设置set
    sql_log_bin=0后进行的思想政治工作不会生成binlog
    事件,也不会被记录到gtid_executed中。施行RESET
    MASTEEscort能够将该变量置空。

  • gtid_purged
    binlog不容许永恒驻留在服务上,须要准期开展清理(通过expire_logs_days能够垄断定时清理距离State of Qatar,不然势必它会把磁盘用尽。gtid_purged用于记录已经被消除了的binlog事务集结,它是gtid_executed的子集。只有gtid_executed为空时技巧手动设置该变量,那个时候会同一时间更新gtid_executed为和gtid_purged相似的值。gtid_executed为空意味着要么以前未曾运行过基于GTID的复制,要么实行过RESET
    MASTETiggo。执行RESET
    MASTERAV4时雷同也会把gtid_purged置空,即始终维持gtid_purged是gtid_executed的子集。

  • gtid_next
    会话级变量,指示怎样发生下叁个GTID。恐怕的取值如下:

    • AUTOMATIC:
      自动生成下叁个GTID,实现上是分配叁个脚下实例上并未有执行过的序号最小的GTID。
    • ANONYMOUS:
      设置后实行工作不会发出GTID。
    • 显式钦定的GTID:
      能够内定自便形式合法的GTID值,但无法是当前gtid_executed中的已经富含的GTID,不然,后一次奉行事务时会报错。

这个变量能够透过show命令查看,比方

mysql> show global variables like 'gtid%';+----------------------+------------------------------------------+| Variable_name        | Value                                    |+----------------------+------------------------------------------+| gtid_deployment_step | OFF                                      || gtid_executed        | e10c75be-5c1b-11e6-ab7c-000c296078ae:1-6 || gtid_mode            | ON                                       || gtid_owned           |                                          || gtid_purged          |                                          |+----------------------+------------------------------------------+5 rows in set (0.02 sec)mysql> show  variables like 'gtid_next';+---------------+-----------+| Variable_name | Value     |+---------------+-----------+| gtid_next     | AUTOMATIC |+---------------+-----------+1 row in set (0.00 sec) 

##==========================================##
依附GTID形式展开故障切换时,无论原Master节点OS是还是不是平时,都不会尝试从原Master节点读取BINLOG举行日志补偿。
基于GTID形式的MHA协理在复制拓扑中接收BINLOG
Server来张开日志补偿,而非GTID格局的MHA会忽视BINLOG Server。
建议在依据GTID情势的群聚焦,不选取MHA进行”手动主从切换”,该操作或许会促成原主库上有的BINLOG错过。

经过mysqldump举行备份

由此mysqldump做三个全量备份

[[email protected] ~]# mysqldump --all-databases --single-transaction --routines --events --host=127.0.0.1 --port=3306 --user=root > dump.sql 

变迁的dump.sql文件里包涵了安装gtid_purged的语句

dump.sql:

SET @MYSQLDUMP_TEMP_LOG_BIN = @@SESSION.SQL_LOG_BIN;SET @@SESSION.SQL_LOG_BIN= 0;...SET @@GLOBAL.GTID_PURGED='e10c75be-5c1b-11e6-ab7c-000c296078ae:1-10';...SET @@SESSION.SQL_LOG_BIN = @MYSQLDUMP_TEMP_LOG_BIN; 

苏醒数据前须求先经过reset master清空gtid_executed变量

[[email protected] ~]# mysql -h127.1 -e 'reset master'[[email protected] ~]# mysql -h127.1 <dump.sql 

要不推行设置GTID_PU兰德GL450GED的SQL时会报下边包车型客车大谬不然

ERROR 1840 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty. 

那个时候苏醒出的MySQL实例的GTID_EXECUTED和备份时点生龙活虎致

mysql> show master status;+---------------+----------+--------------+------------------+-------------------------------------------+| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                         |+---------------+----------+--------------+------------------+-------------------------------------------+| binlog.000002 |      191 |              |                  | e10c75be-5c1b-11e6-ab7c-000c296078ae:1-10 |+---------------+----------+--------------+------------------+-------------------------------------------+1 row in set (0.00 sec) 

鉴于复原出的MySQL实例已经被设置的正确的GTID_EXECUTED,以master_auto_postion
= 1的章程CHANGE MASTE本田UR-V到原本的主节点就可以开端复制。

CHANGE MASTER TO MASTER_HOST='node1', MASTER_USER='repl', MASTER_PASSWORD='repl', MASTER_AUTO_POSITION = 1 

要是不希望备份文件中变化设置GTID_PURGED的SQL,可以给mysqldump传入–set-gtid-purged=OFF关闭。

1、对从库进行间接授权,导致从库比主库具有越来越多BINLOG,但该Binlog因各样缘由被Purged掉

GTID与MHA

MHA是被周边采纳MySQL HA组件,MHA 0.56从今现在协助基于GTID的复制。
MHA在failover时会自动判别是或不是是GTID based
failover,要求满足下边3个尺码即为GTID based failover

  • 抱有节点gtid_mode=1
  • 具备节点Executed_Gtid_Set不为空
  • 最少五个节点Auto_Position=1

和早先的基于binlog文件地方的复制相比,基于GTID复制下,MHA在故障切换时的更换重视如下:

  • 基于binlog文件地方的复制

    • 在Master宕机后会尝试从Master上拷贝binlog日志举行填空   
    • 设若候选Master不具备最新的relay log,会从具备新型relay
      log的Slave上扭转差别的binlog传送到候选Master并实践补给  
    • 新Master的日记补偿到位后,相符应用接受差别binlog的主意将其他Slave和新Master同步后再change
      master到新Master  
  • 基于GTID的复制  

    • 风华正茂旦候选Master不具有最新的relay
      log,让候选Master连上具备新型relay log的Salve举办添补。  
    • 品味从binlog server上拉取缺点和失误的binlog并利用
    • 新Master的数据同步到新型后,让别的的Slave连上新Master并等待数据变成少年老成道。并且可以给masterha_master_switch传入–wait_until_gtid_in_sync=1参数使其不等别的Slave实现数据同步,以加速切换速度。

在GTID格局下MHA不会尝试从旧Master上拷贝binlog日志举行补缺,所以在MySQL进度crash而OS还是健康的景观下,应尽大概不要做主备切换而是原地重启MySQL,除非有任何能有限支持切换后不丢数据的艺术。

在GTID形式下MHA补助在复制拓扑中增添三个或五个binlog
server起到日志补偿的效能,非GTID形式下正是配置了binlog
server也会被MHA忽视。

日志补偿可以说是MHA中最复杂也最精髓的有的,有了GTID后故障切换变得更简约了,不再须求原来复杂的binlog日志解析和增补。所以Oracle官方推出了只帮衬GTID复制的切换工具mysqlfailover,在GTID的赞助下,大家有越来越多可信赖的HA工具得以选拔。

##==========================================##
在依靠GTID方式下,不展会开Phase
3.2级别,即尝试从原Master服务器中赢得最新BINLOG。

GTID长什么样

依照官方文书档案定义,GTID由source_id加transaction_id构成。

GTID = source_id:transaction_id 

上面的source_id提示发起事务的MySQL实例,值为该实例的server_uuid。server_uuid由MySQL在首先次运转时自动生成并被长久化到auto.cnf文件里,transaction_id是MySQL实例上推行的事情序号,从1发轫依次增加。
比方:

e6954592-8dba-11e6-af0e-fa163e1cf111:1 

意气风发组三回九转的工作能够用’-‘连接的作业序号范围表示。比如

e6954592-8dba-11e6-af0e-fa163e1cf111:1-5 

更相似的事态是GTID的集纳。GTID集合能够满含来自八个source_id的作业,它们中间用逗号分隔;如若来自同少年老成source_id的事情序号有八个范围区间,各组范围以内用冒号分隔,举例:

e6954592-8dba-11e6-af0e-fa163e1cf111:1-5:11-18,e6954592-8dba-11e6-af0e-fa163e1cf3f2:1-27 

即,GTID集结拥犹如下的花样定义:

gtid_set:    uuid_set [, uuid_set] ...    | ''uuid_set:    uuid:interval[:interval]...uuid:    hhhhhhhh-hhhh-hhhh-hhhh-hhhhhhhhhhhhh:    [0-9|A-F]interval:    n[-n]    (n >= 1) 

使用非GTID形式切换的日志

怎样在非”双1″下保障crash safe slave

只若是MySQL
5.7得以关闭log_slave_updates,那样MySQL会将已实行的GTIDs实时记下到系统表mysql.gtid_executed中,mysql.gtid_executed是和客商业务一齐提交的,因而能够保证和实际的数额风姿罗曼蒂克致。

log_slave_updates              = OFFrelay_log_recovery             = ON 

假使是MySQL 5.6能够使用如下变化的措施。

遵循基于binlog文件复制时crash safe
slave的渴求安装relay_log_info_repository = TABLE

relay_log_info_repository      = TABLErelay_log_recovery             = ON 

在Slave
crash后,根据relay_log_info_repository设置相应的gitd_purged再展开复制,步骤如下。

  1. 开发银行mysql,但不开启复制

    mysqld --skip-slave-start 
    
  2. 在Slave上改进为基于binlog文件地点的复制

    change master to MASTER_AUTO_POSITION = 0 
    
  3. www.350.vip ,启动slave IO线程

    start slave io_thread 
    

    这里不能够开发银行SQL线程,如若选拔到的GTID已经在Slave的gtid_executed里了,会被Slave
    skip掉。

  4. 检查binlog传输的始发地方(即Retrieved_Gtid_Set的值)

    show slave statusG 
    

    意气风发经输出的Retrieved_Gtid_Set值为e10c75be-5c1b-11e6-ab7c-000c296078ae:7-10

  5. 在Master上检查gtid_executed

    show master status 
    

    只要输出的Executed_Gtid_Set值为e10c75be-5c1b-11e6-ab7c-000c296078ae:1-10

  6. 在Slave上设置gitd_purged为binlog传输地方的日前的GTID的会集

    reset master;set global gitd_purged='e10c75be-5c1b-11e6-ab7c-000c296078ae:1-6'; 
    
  7. 修改回auto position的复制

    change master to MASTER_AUTO_POSITION = 1 
    
  8. 启动slave SQL线程

    start slave sql_thread 
    

只是,这种变化的主意不符合四线程复制。因为三十二线程复制或者爆发gtid
gap和Gap-free low-watermark
position,那会招致Salve上海重型机器厂复apply已经apply过的event。后果就是多少不相像恐怕复制中断,除非设置binlog格式为row情势并且slave_exec_mode=IDEMPOTENT,slave_exec_mode=IDEMPOTENT允许Slave重放binlog时马虎重复键和找不到键的谬误,使得binlog回看具备幂等性,但那也表示风度翩翩旦确实现身了主备数据不相像也会被它忽视。

Sun Jul  8 23:35:21 2018 - [info] MHA::MasterMonitor version 0.56.
Sun Jul  8 23:35:21 2018 - [info] GTID failover mode = 1
Sun Jul  8 23:35:21 2018 - [info] Dead Servers:
Sun Jul  8 23:35:21 2018 - [info] Alive Servers:
Sun Jul  8 23:35:21 2018 - [info]   10.0.203.104(10.0.203.104:3358)
Sun Jul  8 23:35:21 2018 - [info]   10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:35:21 2018 - [info]   10.0.203.117(10.0.203.117:3358)
Sun Jul  8 23:35:21 2018 - [info] Alive Slaves:
Sun Jul  8 23:35:21 2018 - [info]   10.0.203.104(10.0.203.104:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:35:21 2018 - [info]     GTID ON
Sun Jul  8 23:35:21 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:35:21 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jul  8 23:35:21 2018 - [info]   10.0.203.117(10.0.203.117:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:35:21 2018 - [info]     GTID ON
Sun Jul  8 23:35:21 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:35:21 2018 - [info] Current Alive Master: 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:35:21 2018 - [info] Checking slave configurations..
Sun Jul  8 23:35:21 2018 - [info]  read_only=1 is not set on slave 10.0.203.104(10.0.203.104:3358).
Sun Jul  8 23:35:21 2018 - [info]  read_only=1 is not set on slave 10.0.203.117(10.0.203.117:3358).
Sun Jul  8 23:35:21 2018 - [info] Checking replication filtering settings..
Sun Jul  8 23:35:21 2018 - [info]  binlog_do_db= , binlog_ignore_db= 
Sun Jul  8 23:35:21 2018 - [info]  Replication filtering check ok.
Sun Jul  8 23:35:21 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sun Jul  8 23:35:21 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sun Jul  8 23:35:22 2018 - [info] HealthCheck: SSH to 10.0.203.109 is reachable.
Sun Jul  8 23:35:22 2018 - [info] 
10.0.203.109(10.0.203.109:3358) (current master)
 +--10.0.203.104(10.0.203.104:3358)
 +--10.0.203.117(10.0.203.117:3358)

Sun Jul  8 23:35:22 2018 - [warning] master_ip_failover_script is not defined.
Sun Jul  8 23:35:22 2018 - [warning] shutdown_script is not defined.
Sun Jul  8 23:35:22 2018 - [info] Set master ping interval 1 seconds.
Sun Jul  8 23:35:22 2018 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sun Jul  8 23:35:22 2018 - [info] Starting ping health check on 10.0.203.109(10.0.203.109:3358)..
Sun Jul  8 23:35:22 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Sun Jul  8 23:35:58 2018 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Sun Jul  8 23:35:58 2018 - [info] Executing SSH check script: exit 0
Sun Jul  8 23:35:58 2018 - [info] HealthCheck: SSH to 10.0.203.109 is reachable.
Sun Jul  8 23:35:59 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Jul  8 23:35:59 2018 - [warning] Connection failed 2 time(s)..
Sun Jul  8 23:36:00 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Jul  8 23:36:00 2018 - [warning] Connection failed 3 time(s)..
Sun Jul  8 23:36:01 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Jul  8 23:36:01 2018 - [warning] Connection failed 4 time(s)..
Sun Jul  8 23:36:01 2018 - [warning] Master is not reachable from health checker!
Sun Jul  8 23:36:01 2018 - [warning] Master 10.0.203.109(10.0.203.109:3358) is not reachable!
Sun Jul  8 23:36:01 2018 - [warning] SSH is reachable.
Sun Jul  8 23:36:01 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Sun Jul  8 23:36:01 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jul  8 23:36:01 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sun Jul  8 23:36:01 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sun Jul  8 23:36:01 2018 - [info] GTID failover mode = 1
Sun Jul  8 23:36:01 2018 - [info] Dead Servers:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info] Alive Servers:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.104(10.0.203.104:3358)
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.117(10.0.203.117:3358)
Sun Jul  8 23:36:01 2018 - [info] Alive Slaves:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.104(10.0.203.104:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.117(10.0.203.117:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info] Checking slave configurations..
Sun Jul  8 23:36:01 2018 - [info]  read_only=1 is not set on slave 10.0.203.104(10.0.203.104:3358).
Sun Jul  8 23:36:01 2018 - [info]  read_only=1 is not set on slave 10.0.203.117(10.0.203.117:3358).
Sun Jul  8 23:36:01 2018 - [info] Checking replication filtering settings..
Sun Jul  8 23:36:01 2018 - [info]  Replication filtering check ok.
Sun Jul  8 23:36:01 2018 - [info] Master is down!
Sun Jul  8 23:36:01 2018 - [info] Terminating monitoring script.
Sun Jul  8 23:36:01 2018 - [info] Got exit code 20 (Master dead).
Sun Jul  8 23:36:01 2018 - [info] MHA::MasterFailover version 0.56.
Sun Jul  8 23:36:01 2018 - [info] Starting master failover.
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 1: Configuration Check Phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] GTID failover mode = 1
Sun Jul  8 23:36:01 2018 - [info] Dead Servers:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info] Checking master reachability via MySQL(double check)...
Sun Jul  8 23:36:01 2018 - [info]  ok.
Sun Jul  8 23:36:01 2018 - [info] Alive Servers:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.104(10.0.203.104:3358)
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.117(10.0.203.117:3358)
Sun Jul  8 23:36:01 2018 - [info] Alive Slaves:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.104(10.0.203.104:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.117(10.0.203.117:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info] Starting GTID based failover.
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Sun Jul  8 23:36:01 2018 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Sun Jul  8 23:36:01 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sun Jul  8 23:36:01 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 3: Master Recovery Phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] The latest binary log file/position on all slaves is mysql-bin.000008:6689
Sun Jul  8 23:36:01 2018 - [info] Retrieved Gtid Set: 541e0f07-8047-11e8-8434-0800270b00d2:49-69
Sun Jul  8 23:36:01 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.104(10.0.203.104:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.117(10.0.203.117:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info] The oldest binary log file/position on all slaves is mysql-bin.000008:6689
Sun Jul  8 23:36:01 2018 - [info] Retrieved Gtid Set: 541e0f07-8047-11e8-8434-0800270b00d2:49-69
Sun Jul  8 23:36:01 2018 - [info] Oldest slaves:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.104(10.0.203.104:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.117(10.0.203.117:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 3.3: Determining New Master Phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] Searching new master from slaves..
Sun Jul  8 23:36:01 2018 - [info]  Candidate masters from the configuration file:
Sun Jul  8 23:36:01 2018 - [info]   10.0.203.104(10.0.203.104:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jul  8 23:36:01 2018 - [info]     GTID ON
Sun Jul  8 23:36:01 2018 - [info]     Replicating from 10.0.203.109(10.0.203.109:3358)
Sun Jul  8 23:36:01 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jul  8 23:36:01 2018 - [info]  Non-candidate masters:
Sun Jul  8 23:36:01 2018 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Sun Jul  8 23:36:01 2018 - [info] New master is 10.0.203.104(10.0.203.104:3358)
Sun Jul  8 23:36:01 2018 - [info] Starting master failover..
Sun Jul  8 23:36:01 2018 - [info] 
From:
10.0.203.109(10.0.203.109:3358) (current master)
 +--10.0.203.104(10.0.203.104:3358)
 +--10.0.203.117(10.0.203.117:3358)

To:
10.0.203.104(10.0.203.104:3358) (new master)
 +--10.0.203.117(10.0.203.117:3358)
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 3.3: New Master Recovery Phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info]  Waiting all logs to be applied.. 
Sun Jul  8 23:36:01 2018 - [info]   done.
Sun Jul  8 23:36:01 2018 - [info] Getting new master's binlog name and position..
Sun Jul  8 23:36:01 2018 - [info]  mysql-bin.000006:77499
Sun Jul  8 23:36:01 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.203.104', MASTER_PORT=3358, MASTER_AUTO_POSITION=1, MASTER_USER='replicater', MASTER_PASSWORD='xxx';
Sun Jul  8 23:36:01 2018 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000006, 77499, 41d8a420-8047-11e8-8580-080027e837eb:1-92,
541e0f07-8047-11e8-8434-0800270b00d2:1-69
Sun Jul  8 23:36:01 2018 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Sun Jul  8 23:36:01 2018 - [info] ** Finished master recovery successfully.
Sun Jul  8 23:36:01 2018 - [info] * Phase 3: Master Recovery Phase completed.
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 4: Slaves Recovery Phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 4.1: Starting Slaves in parallel..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] -- Slave recovery on host 10.0.203.117(10.0.203.117:3358) started, pid: 5680. Check tmp log /var/log/masterha/app1/10.0.203.117_3358_20180708233601.log if it takes time..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] Log messages from 10.0.203.117 ...
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info]  Resetting slave 10.0.203.117(10.0.203.117:3358) and starting replication from the new master 10.0.203.104(10.0.203.104:3358)..
Sun Jul  8 23:36:01 2018 - [info]  Executed CHANGE MASTER.
Sun Jul  8 23:36:01 2018 - [info]  Slave started.
Sun Jul  8 23:36:01 2018 - [info]  gtid_wait(41d8a420-8047-11e8-8580-080027e837eb:1-92,
541e0f07-8047-11e8-8434-0800270b00d2:1-69) completed on 10.0.203.117(10.0.203.117:3358). Executed 0 events.
Sun Jul  8 23:36:01 2018 - [info] End of log messages from 10.0.203.117.
Sun Jul  8 23:36:01 2018 - [info] -- Slave on host 10.0.203.117(10.0.203.117:3358) started.
Sun Jul  8 23:36:01 2018 - [info] All new slave servers recovered successfully.
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] * Phase 5: New master cleanup phase..
Sun Jul  8 23:36:01 2018 - [info] 
Sun Jul  8 23:36:01 2018 - [info] Resetting slave info on the new master..
Sun Jul  8 23:36:01 2018 - [info]  10.0.203.104: Resetting slave info succeeded.
Sun Jul  8 23:36:01 2018 - [info] Master failover to 10.0.203.104(10.0.203.104:3358) completed successfully.
Sun Jul  8 23:36:01 2018 - [info] 

----- Failover Report -----

app1: MySQL Master failover 10.0.203.109(10.0.203.109:3358) to 10.0.203.104(10.0.203.104:3358) succeeded

Master 10.0.203.109(10.0.203.109:3358) is down!

Check MHA Manager logs at localhost.localdomain:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Selected 10.0.203.104(10.0.203.104:3358) as a new master.
10.0.203.104(10.0.203.104:3358): OK: Applying all logs succeeded.
10.0.203.117(10.0.203.117:3358): OK: Slave started, replicating from 10.0.203.104(10.0.203.104:3358)
10.0.203.104(10.0.203.104:3358): Resetting slave info succeeded.
Master failover to 10.0.203.104(10.0.203.104:3358) completed successfully.

GTID的漫长化

GTID相关的消息囤积在binlog文件中,为此MySQL5.6新扩充了上面2个binlog事件。

  • Previous_gtids_log_event在各种binlog文件的发端部分,记录在该binlog文件在此之前已实行的GTID集合。
  • Gtid_log_event即眼下见到的set gtid_next
    …,它现身在每一种业务的前头,评释下叁个作业的gtid。

示范如下:

mysql> show binlog events IN 'binlog.000015';+---------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+| Log_name      | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |+---------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+| binlog.000015 |   4 | Format_desc    |         1 |         120 | Server ver: 5.6.31-77.0-log, Binlog ver: 4                        || binlog.000015 | 120 | Previous_gtids |         1 |         191 | e10c75be-5c1b-11e6-ab7c-000c296078ae:1-6                          || binlog.000015 | 191 | Gtid           |         1 |         239 | SET @@SESSION.GTID_NEXT= 'e10c75be-5c1b-11e6-ab7c-000c296078ae:7' || binlog.000015 | 239 | Query          |         1 |         318 | BEGIN                                                             || binlog.000015 | 318 | Query          |         1 |         418 | use `test`; insert into tbx1 values(1)                            || binlog.000015 | 418 | Xid            |         1 |         449 | COMMIT /* xid=13 */                                               || binlog.000015 | 449 | Gtid           |         1 |         497 | SET @@SESSION.GTID_NEXT= 'e10c75be-5c1b-11e6-ab7c-000c296078ae:8' || binlog.000015 | 497 | Query          |         1 |         576 | BEGIN                                                             || binlog.000015 | 576 | Query          |         1 |         676 | use `test`; insert into tbx1 values(1)                            || binlog.000015 | 676 | Xid            |         1 |         707 | COMMIT /* xid=17 */                                               || binlog.000015 | 707 | Gtid           |         1 |         755 | SET @@SESSION.GTID_NEXT= 'e10c75be-5c1b-11e6-ab7c-000c296078ae:9' || binlog.000015 | 755 | Query          |         1 |         834 | BEGIN                                                             || binlog.000015 | 834 | Query          |         1 |         934 | use `test`; insert into tbx1 values(1)                            || binlog.000015 | 934 | Xid            |         1 |         965 | COMMIT /* xid=20 */                                               |+---------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+14 rows in set (0.00 sec) 

MySQL服务器运维时,通过读binlog文件,起头化gtid_executed和gtid_purged,使它们的值能和上次MySQL运营时同样。

  • gtid_executed被安装为新型的binlog文件中Previous_gtids_log_event和所有Gtid_log_event的并集。
  • gtid_purged为最老的binlog文件中Previous_gtids_log_event。

是因为那一个根本的变量值记录在binlog中,所以开启gtid_mode时必需同期在主库上开启log_bin在备库上开启log_slave_updates。

唯独,在MySQL5.7中绝非那些界定。MySQL5.7中,新增三个种类表mysql.gtid_executed用于持久化已实行的GTID会集。当主库上未有拉开log_bin或在备库上从未有过展开log_slave_updates时,mysql.gtid_executed会跟客户业务一同每一遍换代。不然只在binlog日志发生rotation时更新mysql.gtid_executed。

##==========================================##
依赖GTID举行故障切换的基准:
1、全数节点开启GTID情势,设置gtid_mode=1
2、全数节点上Executed_Gtid_Set不为空
3、最少三个节点使用Auto_Position=1

MTS下蓄意的标题

在同一时候利用MTS(slave_parallel_workers> 1卡塔尔国时,就算按下边crash safe
slave的渴求安装了依赖GTID的复制,Slave crash后再重启还是会促成复制中断。

透过强迫杀掉MySQL所在虚机的主意模拟Slave宕机,然后再起步MySQL,mysql日志中好似下错误消息:

---------------------------------2016-10-26 21:00:23 2699 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=mysql-relay-bin' to avoid this problem.2016-10-26 21:00:24 2699 [Note] Slave: MTS group recovery relay log info based on Worker-Id 1, group_relay_log_name ./mysql-relay-bin.000011, group_relay_log_pos 2017523 group_master_log_name binlog.000007, group_master_log_pos 20173632016-10-26 21:00:24 2699 [ERROR] Error looking for file after ./mysql-relay-bin.000012.2016-10-26 21:00:24 2699 [ERROR] Failed to initialize the master info structure2016-10-26 21:00:24 2699 [Note] Check error log for additional messages. You will not be able to start replication until the issue is resolved and the server restarted.2016-10-26 21:00:24 2699 [Note] Event Scheduler: Loaded 0 events2016-10-26 21:00:24 2699 [Note] mysqld: ready for connections.Version: '5.6.31-77.0-log'  socket: '/data/mysql/mysql.sock'  port: 3306  Percona Server (GPL), Release 77.0, Revision 5c1061c--------------------------------- 

起首slave时也会报错

mysql> start slave;ERROR 1872 (HY000): Slave failed to initialize relay log info structure from the repository 

现身这种光景的原因在于,relay_log_recovery=1且slave_parallel_workers>1的动静下,mysql运行时会跻身MTS
Group复苏流程,即读取relay
log,尝试增补由于三十四线程复制招致的gap。然后relay
log文件由于不是实时刷新的,在relay log文件中找不到gap对应的relay
log记录(覆盖了gap的relay
log开头和得了地方分别被称呼低水位和高水位,低水位点即slave_relay_log_info.Relay_log_pos的值State of Qatar就能够报这么些错。

其实,在GTID格局下,slave在apply
event的时候能够跳过重复事件,所以可以悠闲自在的从低水位点应用日志,没要求深入分析relay
log文件。
那看起来是一个bug,于是提交了二个bug报告#83713,近年来还一直不接过回复。

作为隐藏方法,能够透过消逝relay log文件,跳过那几个荒唐。试行步骤如下

reset slave;change master to MASTER_AUTO_POSITION = 1start slave; 

在这里处,单纯的调reset
slave不能够把状态清理深透,内部的Relay_log_info.inited标识位还是处于未被发轫化状态,那时候调用start
slave仍旧会战败。因而须要补一刀change master。

www.350.vip 2

因而Xtrabackup进行备份

比较mysqldump,Xtrabackup是效用越来越高何况被大规模选拔的备份形式。使用Xtrabackup实行备份的比如如下。

由此Xtrabackup创八个全量备份(能够在Slave上创造备份,以制止对主库的天性冲击)

innobackupex --defaults-file=/etc/my.cnf --host=127.1 --user=root --password=mysql --no-timestamp --safe-slave-backup --slave-info /mysql/bak 

选拔日志

innobackupex --apply-log /mysql/bak 

查看备份目录中的xtrabackup_binlog_info文件能够找到备份时生龙活虎度举办过的gtids

[[email protected] ~]# cat /mysql/bak/xtrabackup_binlog_infomysql_bin.000001    191 e10c75be-5c1b-11e6-ab7c-000c296078ae:1-10 

出于备份时增加了”–slave-info”选项並且从Slave节点拉取的备份,所以会生成xtrabackup_slave_info文件,也得以从这一个文件里搜求构建复制的SQL语句。

[[email protected] ~]# cat /mysql/bak/xtrabackup_slave_infoSET GLOBAL gtid_purged='e10c75be-5c1b-11e6-ab7c-000c296078ae:1-10';CHANGE MASTER TO MASTER_AUTO_POSITION=1 

将备份文件传送到新的节点node3的/mysql/bak目录并恢复生机(借使直接把备份传输到数码目录了,这一步能够大约卡塔尔国。

[[email protected] ~]# innobackupex --defaults-file=/etc/my.cnf --copy-back /mysql/bak 

启动MySQL。

[[email protected] ~]# mysqld --defaults-file=/home/mysql/etc/my.cnf --skip-slave-start & 

假如是从Slave拉的备份,绝对不可以一直张开Slave复制,这个时候的gtid_executed是谬误的。须求手动设置gtid_purged后再start
slave

reset master;SET GLOBAL gtid_purged='e10c75be-5c1b-11e6-ab7c-000c296078ae:1-10';CHANGE MASTER TO MASTER_HOST='node1',MASTER_USER='repl',MASTER_PASSWORD='repl',MASTER_AUTO_POSITION=1;start slave; 

万一聚积因为某种原因引致基本节点上的Executed_Gtid_Set不同,如:

GTID与crash safe slave

crash safe slave是MySQL 5.6提供的作用,意思是说在slave
crash后,把slave重新拉起来能够世襲从Master实行复制,不会并发复制错误也不会并发数量不均等。

1、通过RESET MASTER和SET GLOBAL
gtid_purged=”使得全部节点有所风华正茂致的GTID 集结

前言

GTID(Global Transaction
ID卡塔尔国是MySQL5.6引进的效率,能够在集群全局范围标记事务,用于代替过去因此binlog文件偏移量定位复制地方的古板方法。依靠GTID,在发出主备切换的气象下,MySQL的别样Slave可以活动在新主上找到准确的复制地方,那大大简化了复杂复制拓扑下集群的维护,也缩小了人工设置复制地点发生误操作的高危机。此外,基于GTID的复制能够忽视已经实行过的业务,降低了数据产生不风姿浪漫致的高危害。

GTID虽好,要想接受熟练还需足够理解其规律与风味,非常要专心与价值观的基于binlog文件偏移量复制情势不相同等的地点。本文概述了关于GTID的多少个左近难点,希望能对通晓和选用基于GTID的复制有所扶助。

发表评论

电子邮件地址不会被公开。 必填项已用*标注