Create an index on a huge MySQL production table without table locking

copycodes 2020. 9. 9. 08:10

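I need to create an index on a MySQL table with ~5M rows. It is a production table, and I fear that running a CREATE INDEX statement will block everything completely...

Is there a way to create that index without blocking inserts and selects?

I am wondering whether I can avoid having to stop the system, create the index, and restart it!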


[2017] Update: MySQL 5.6 supports online index updates

https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html

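In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.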

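[2015] Updating table indices blocks writes in MySQL 5.5

From the answer above:

"If you are using a version greater than 5.1, indexes are created while the database is online. So don't worry, you won't interrupt production system usage."

This is ****FALSE**** (at least for MyISAM / InnoDB tables, which is what 99.999% of people use. Clustered Edition is different.)

UPDATE operations on a table will BLOCK while the index is being created. MySQL is really, really stupid about this.

Test script: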

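# Background loop: time 50 single-row statements against the table.
(
  for n in {1..50}; do
    # Read test: uncomment this line and comment out the update below to time SELECTs instead.
    #(time mysql -uroot -e 'select  * from website_development.users where id = 41225\G'>/dev/null) 2>&1 | grep real;
    # Write test: time a single-row UPDATE.
    (time mysql -uroot -e 'update website_development.users set bio="" where id = 41225\G'>/dev/null) 2>&1 | grep real;
  done
) | cat -n &
PID=$!
sleep 0.05
# Foreground: build the index while the loop is still running.
echo "Index Update - START"
mysql -uroot website_development -e 'alter table users add index ddopsonfu (last_name, email, first_name, confirmation_token, current_sign_in_ip);'
echo "Index Update - FINISH"
sleep 0.05
kill $PID
# Clean up: drop the test index and time the drop.
time mysql -uroot website_development -e 'drop index ddopsonfu on users;'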

My server (InnoDB):

Server version: 5.5.25a Source distribution

Output (note how the 6th operation blocks for the ~400ms it takes to finish the index update):

 1  real    0m0.009s
 2  real    0m0.009s
 3  real    0m0.009s
 4  real    0m0.012s
 5  real    0m0.009s
Index Update - START
Index Update - FINISH
 6  real    0m0.388s
 7  real    0m0.009s
 8  real    0m0.009s
 9  real    0m0.009s
10  real    0m0.009s
11  real    0m0.009s
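
Spotting the blocked operation by eye gets tedious on longer runs. A minimal sketch of a filter for this output format follows; the function name flag_slow_ops and the 0.1 s default threshold are illustrative assumptions, not part of the original test.

```shell
# Flag operations whose wall-clock time exceeds a threshold (in seconds).
# Expects lines such as " 6  real    0m0.388s" (cat -n plus `time` output).
# flag_slow_ops and the 0.1 default are hypothetical, chosen for illustration.
flag_slow_ops() {
  awk -v limit="${1:-0.1}" '
    $2 == "real" {
      # $3 looks like "0m0.388s": split minutes from seconds.
      split($3, t, "m")
      # Adding 0 makes awk read the numeric prefix of "0.388s".
      secs = t[1] * 60 + (t[2] + 0)
      if (secs > limit) printf "op %s blocked for %.3fs\n", $1, secs
    }'
}

# Usage: pipe the test script output through the filter, e.g.
#   ./index_test.sh 2>&1 | flag_slow_ops 0.1
```

Fed the write-test output above with a 0.1 s threshold, it would report only operation 6.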

Versus read operations, which do not block (swap the commented line in the script):

 1  real    0m0.010s
 2  real    0m0.009s
 3  real    0m0.009s
 4  real    0m0.010s
 5  real    0m0.009s
Index Update - START
 6  real    0m0.010s
 7  real    0m0.010s
 8  real    0m0.011s
 9  real    0m0.010s
...
41  real    0m0.009s
42  real    0m0.010s
43  real    0m0.009s
Index Update - FINISH
44  real    0m0.012s
45  real    0m0.009s
46  real    0m0.009s
47  real    0m0.010s
48  real    0m0.009s

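Updating MySQL's schema without downtime

So far, I know of only one way to update a MySQL schema without suffering an availability outage. Circular masters:

  • Master A has your MySQL database running on it.
  • Bring Master B into service and have it replicate writes from Master A (B is a slave of A).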
  • Perform the schema update on Master B. It will fall behind during the upgrade.
  • Let Master B catch up. Invariant: Your schema change MUST be capable of processing commands replicated from a downversion schema. Indexing changes qualify. Simple column additions usually qualify. Removing a column? Probably not.
  • ATOMICALLY swap all clients from Master A to Master B. If you want to be safe (trust me, you do), you should ensure that the last write to A is replicated to B BEFORE B takes its first write. If you allow concurrent writes to 2+ masters, ... you better understand MySQL replication at a DEEP level or you are headed for a world of pain. Extreme pain. Like, do you have a column that is AUTO_INCREMENT??? You are screwed (unless you use even numbers on one master and odds on the other). Do NOT trust MySQL replication to "do the right thing". It is NOT smart and will not save you. It's just slightly less safe than copying binary transaction logs from the command-line and replaying them by hand. Still, disconnecting all clients from the old master and flipping them to the new master can be done in a matter of seconds, vastly faster than waiting for a multi-hour schema upgrade.
  • Now Master B is your new master. You have the new schema. Life is good. Have a beer; the worst is over.
  • Repeat the process with Master A, upgrading his schema so that he becomes your new secondary master, ready to take over in the event that your primary master (master B now) loses power or just up and dies on you.
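
The "last write to A is replicated to B" condition in the swap step can be checked by comparing binlog coordinates once writes to A have stopped. This is a rough sketch under assumptions: the helper name replica_caught_up is made up here, and the inputs are expected to come from SHOW MASTER STATUS on A and SHOW SLAVE STATUS\G on B.

```shell
# Succeeds only if the replica has applied everything up to the master's
# current binlog coordinates. replica_caught_up is a hypothetical helper.
#   $1, $2: File and Position columns from `SHOW MASTER STATUS` on Master A
#   $3:     full text of `SHOW SLAVE STATUS\G` from Master B
replica_caught_up() (
  master_file=$1
  master_pos=$2
  # Relay_Master_Log_File / Exec_Master_Log_Pos give the master coordinates
  # of the last event the replica's SQL thread has executed.
  exec_file=$(printf '%s\n' "$3" | awk -F': *' '$1 ~ /Relay_Master_Log_File$/ {print $2}')
  exec_pos=$(printf '%s\n' "$3" | awk -F': *' '$1 ~ /Exec_Master_Log_Pos$/ {print $2}')
  [ "$exec_file" = "$master_file" ] && [ "$exec_pos" = "$master_pos" ]
)

# Usage (writes to A must already be stopped, or the coordinates keep moving):
#   status=$(mysql -h master-b -e 'SHOW SLAVE STATUS\G')
#   replica_caught_up mysql-bin.000003 1500 "$status" && echo "safe to swap"
```

The host name master-b and the coordinates in the usage comment are placeholders. The check only means something if nothing is still writing to A when the master coordinates are read.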

An easy way to update schema this isn't. Workable in a serious production environment; yes, it is. Please, please, please, if there is an easier way to add an index to a MySQL table without blocking writes, let me know.

Googling led me to this article which describes a similar technique. Even better, they advise drinking at the same point in the procedure (note that I wrote my answer before reading the article)!

Percona's pt-online-schema-change

The article I linked above talks about a tool, pt-online-schema-change, that works as follows:

  • Create new table with same structure as original.
  • Update schema on new table.
  • Add a trigger on the original table so that changes are kept in-sync with the copy
  • Copy rows in batches from original table.
  • Move original table out of the way and replace with new table.
  • Drop old table.

I've never tried the tool myself. YMMV
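
For reference, the tool takes the ALTER clause and a DSN naming the database and table, and it can rehearse the steps above with --dry-run before a real run with --execute. A minimal sketch that only prints the command line for review; the wrapper name ptosc_cmd and the index name idx_last_name are hypothetical, while the database, table, and column come from the test script earlier in this post.

```shell
# Print a pt-online-schema-change command line for review; nothing is executed.
# ptosc_cmd is a hypothetical wrapper.
#   $1 database, $2 table, $3 ALTER clause (without "ALTER TABLE ..."),
#   $4 mode: --dry-run (default) or --execute
ptosc_cmd() {
  printf 'pt-online-schema-change --alter "%s" D=%s,t=%s %s\n' \
    "$3" "$1" "$2" "${4:---dry-run}"
}

# idx_last_name is an illustrative index name.
ptosc_cmd website_development users "ADD INDEX idx_last_name (last_name)"
```

Defaulting to --dry-run keeps an accidental invocation harmless; switching to --execute is a deliberate second step.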

RDS

I'm currently using MySQL through Amazon's RDS. It's a really nifty service that wraps up and manages MySQL, letting you add new read replicas with a single button and transparently upgrade the database across hardware SKUs. It's really convenient. You don't get SUPER access to the database, so you can't screw with replication directly (is this a blessing or curse?). However, you can use Read Replica Promotion to make your schema changes on a read-only slave, then promote that slave to become your new master. Exactly the same trick as I described above, just vastly easier to execute. They still don't do much to help you with the cut-over. You have to reconfigure and restart your app.
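
With the AWS CLI, the replica-promotion route might look roughly like the plan below. This is a sketch under assumptions: the instance identifiers are placeholders, the function rds_promotion_plan is made up for illustration, and it only prints the commands. How the replica is made to accept the schema change depends on the RDS configuration and is not shown.

```shell
# Print the AWS CLI steps for the replica-promotion approach; nothing is executed.
# rds_promotion_plan is a hypothetical helper.
#   $1 source instance identifier, $2 replica instance identifier
rds_promotion_plan() {
  printf 'aws rds create-db-instance-read-replica --db-instance-identifier %s --source-db-instance-identifier %s\n' "$2" "$1"
  printf '# run the schema change against %s, then wait for it to catch up\n' "$2"
  printf 'aws rds promote-read-replica --db-instance-identifier %s\n' "$2"
}

# mydb and mydb-replica are placeholder identifiers.
rds_promotion_plan mydb mydb-replica
```

Pointing the application at the promoted instance is still a manual step, as noted above.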


As this blog post outlines, the InnoDB ALTER TABLE mechanism has been completely redesigned for MySQL 5.6.

(For an exhaustive overview of this topic, the MySQL documentation can provide an afternoon's worth of reading.)

To add an index to a table without locking out UPDATE/INSERT statements, the following statement format can be used:

ALTER TABLE my_table ADD INDEX my_table__idx (my_column), ALGORITHM=INPLACE, LOCK=NONE;
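
Spelling out ALGORITHM=INPLACE and LOCK=NONE is useful because, if the server cannot honour the requested lock level, the statement fails with an error instead of quietly taking a stronger lock. A small sketch that generates the statement for a given table; the helper name online_add_index_sql and the index name are hypothetical, and the table and columns are borrowed from the test script earlier in this post.

```shell
# Build an online ADD INDEX statement in the MySQL 5.6+ format shown above.
# online_add_index_sql is a hypothetical helper.
#   $1 table, $2 index name, $3 comma-separated column list
online_add_index_sql() {
  printf 'ALTER TABLE %s ADD INDEX %s (%s), ALGORITHM=INPLACE, LOCK=NONE;\n' "$1" "$2" "$3"
}

# idx_users_last_name is an illustrative index name.
online_add_index_sql users idx_users_last_name "last_name, email"

# To run it, pipe the statement into the client, e.g.
#   online_add_index_sql users idx_users_last_name "last_name, email" | mysql -uroot website_development
```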

MySQL 5.6 update (Feb 2013): You can now perform read and write operations while an index is being created, even with InnoDB tables - http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html

In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.

and:

In MySQL 5.6, this feature becomes more general: you can read and write to tables while an index is being created, and many more kinds of ALTER TABLE operations can be performed without copying the table, without blocking DML operations, or both. Thus in MySQL 5.6 and higher, we typically refer to this set of features as online DDL rather than Fast Index Creation.

from http://dev.mysql.com/doc/refman/5.6/en/glossary.html#glos_fast_index_creation


pt-online-schema-change is the way to go if you really want to make sure that the migration will not bring down the site.

As I wrote in the above comment, I have used pt-online-schema-change in production several times. Our main table has 20M+ records, and we run a master with 2 read-only replication slaves. I've done at least a dozen migrations with pt-online-schema-change, from adding a new column and changing the charset to adding several indices. We serve tons of traffic during the migrations as well, and we have not had any hiccups. Of course, you'd have to test all the scripts very thoroughly before running them on production.

I tried to batch up the changes into 1 script so that pt-online-schema-change only has to copy the data once. Be very careful with changing a column name, since you will lose your data. However, adding an index should be fine.
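
Batching works because the --alter option accepts several comma-separated clauses, just like a regular ALTER TABLE, so the table is copied once for all of them. A minimal sketch of assembling that value; the helper name join_alter_clauses and the column and index names are hypothetical.

```shell
# Join several ALTER clauses into one comma-separated value for --alter,
# so the table is copied only once. join_alter_clauses is a hypothetical helper.
join_alter_clauses() (
  out=""
  for clause in "$@"; do
    if [ -z "$out" ]; then
      out=$clause
    else
      out="$out, $clause"
    fi
  done
  printf '%s\n' "$out"
)

# nickname and idx_email are illustrative names.
join_alter_clauses "ADD COLUMN nickname VARCHAR(64)" "ADD INDEX idx_email (email)"
```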

Reference URL: https://stackoverflow.com/questions/4244685/create-an-index-on-a-huge-mysql-production-table-without-table-locking
