Why can't mySQL's primary key use uuid?

Jun 01, 2021 Article blog

3. Second, the use of uuid and self-reinforcing id index structure comparison

Original link: cnblogs.com/wyq178/p/12548864.html

preface

When designing tables in mysql, mysql officially recommends not using uuid or non-continuous non-repeating 雪花id (long-shaped and unique, stand-alone increments), but recommending continuous self-increasing primary key ids, the official recommendation is auto_increment so why not recommend uuid what's the harm of using uuid

In this article, let's analyze the problem and explore the internal causes.

First, mysql and program instances

1.1. To illustrate this problem, let's start by creating three tables

They are user_auto_key user_uuid user_random_key respectively, representing the auto-growth primary key, uuid as the primary key, random key as the primary key, and the rest we remain completely unchanged.

Depending on the control variable method, we only generate each table's primary key with a different policy, while the other fields are exactly the same, and then test the table's insertion speed and query speed:

Note: The random key here actually refers to the inconsistent and non-repeated irregular id calculated by the snowflake algorithm: a string of 18-bit-length long values

id automatically generates tables:

Why can't mySQL's primary key use uuid?1

User uuid table

Why can't mySQL's primary key use uuid?2

Random primary key table:

Why can't mySQL's primary key use uuid?3

1.2. The theory alone is not possible, go directly to the program, use the jdbcTemplate of spring to achieve additional inspection testing:

Technical framework: springboot , jdbcTemplate , junit+hutool , the principle of the program is to connect their own test database, and then write the same amount of data in the same environment, to analyze insert insertion time to integrate its efficiency, in order to achieve the most realistic effect, all the data is randomly generated, such as names, mailboxes, addresses are randomly generated.

package com.wyq.mysqldemo;
import cn.hutool.core.collection.CollectionUtil;
import com.wyq.mysqldemo.databaseobject.UserKeyAuto;
import com.wyq.mysqldemo.databaseobject.UserKeyRandom;
import com.wyq.mysqldemo.databaseobject.UserKeyUUID;
import com.wyq.mysqldemo.diffkeytest.AutoKeyTableService;
import com.wyq.mysqldemo.diffkeytest.RandomKeyTableService;
import com.wyq.mysqldemo.diffkeytest.UUIDKeyTableService;
import com.wyq.mysqldemo.util.JdbcTemplateService;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.util.StopWatch;
import java.util.List;
@SpringBootTest
class MysqlDemoApplicationTests {


    @Autowired
    private JdbcTemplateService jdbcTemplateService;


    @Autowired
    private AutoKeyTableService autoKeyTableService;


    @Autowired
    private UUIDKeyTableService uuidKeyTableService;


    @Autowired
    private RandomKeyTableService randomKeyTableService;




    @Test
    void testDBTime() {


        StopWatch stopwatch = new StopWatch("执行sql时间消耗");




        /**
         * auto_increment key任务
         */
        final String insertSql = "INSERT INTO user_key_auto(user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?)";


        List<UserKeyAuto> insertData = autoKeyTableService.getInsertData();
        stopwatch.start("自动生成key表任务开始");
        long start1 = System.currentTimeMillis();
        if (CollectionUtil.isNotEmpty(insertData)) {
            boolean insertResult = jdbcTemplateService.insert(insertSql, insertData, false);
            System.out.println(insertResult);
        }
        long end1 = System.currentTimeMillis();
        System.out.println("auto key消耗的时间:" + (end1 - start1));


        stopwatch.stop();




        /**
         * uudID的key
         */
        final String insertSql2 = "INSERT INTO user_uuid(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";


        List<UserKeyUUID> insertData2 = uuidKeyTableService.getInsertData();
        stopwatch.start("UUID的key表任务开始");
        long begin = System.currentTimeMillis();
        if (CollectionUtil.isNotEmpty(insertData)) {
            boolean insertResult = jdbcTemplateService.insert(insertSql2, insertData2, true);
            System.out.println(insertResult);
        }
        long over = System.currentTimeMillis();
        System.out.println("UUID key消耗的时间:" + (over - begin));


        stopwatch.stop();




        /**
         * 随机的long值key
         */
        final String insertSql3 = "INSERT INTO user_random_key(id,user_id,user_name,sex,address,city,email,state) VALUES(?,?,?,?,?,?,?,?)";
        List<UserKeyRandom> insertData3 = randomKeyTableService.getInsertData();
        stopwatch.start("随机的long值key表任务开始");
        Long start = System.currentTimeMillis();
        if (CollectionUtil.isNotEmpty(insertData)) {
            boolean insertResult = jdbcTemplateService.insert(insertSql3, insertData3, true);
            System.out.println(insertResult);
        }
        Long end = System.currentTimeMillis();
        System.out.println("随机key任务消耗时间:" + (end - start));
        stopwatch.stop();




        String result = stopwatch.prettyPrint();
        System.out.println(result);
    }

1.3. Program write results

user_key_auto Write Results:

Why can't mySQL's primary key use uuid?4

user_random_key Write Results:

Why can't mySQL's primary key use uuid?5

user_uuid table write results:

Why can't mySQL's primary key use uuid?6

1.4. Efficiency test results

Why can't mySQL's primary key use uuid?7

At 130W: Let's test insert 10w again to see what happens:

Why can't mySQL's primary key use uuid?8

It can be seen that at about 100W of data, uuid insertion efficiency is at the bottom, and 130W of data is added in the latter sequence, uudi time plummets.

auto_key overall efficiency ranking for time occupancy is: auto_key > random_key uuid uuid is the least efficient, and in the case of large amounts of data, efficiency plummets. S o why is this happening? With questions, let's explore this question:

Second, the use of uuid and self-reinforcing id index structure comparison

2.1. Use the internal structure of the self-increasing id

Why can't mySQL's primary key use uuid?9

The value of the self-increasing primary key is sequential, so Innodb stores each record behind a record. When the maximum fill factor for the page is reached (the default maximum fill factor for innodb is 15/16 of the page size, leaving 1/16 of the space for later modifications):

The next record is written to a new page, and once the data is loaded in this order, the primary key page fills up almost sequential records, increasing the maximum fill rate of the page without the waste of pages
The newly inserted row is bound to be on the next row of the original maximum data row, and mysql is positioned and addressed quickly without additional consumption to calculate the location of the new row
Reduces page fragmentation and fragmentation

2.2. Use the index internal structure of uuid

Why can't mySQL's primary key use uuid?10

Because uuid relatively sequential self-increasing id is irregular, and the value of the new row does not have to be larger than the value of the previous primary key, innodb cannot always insert the new row into the end of the index, but needs to find a new suitable location for the new row to allocate new space.

This process requires a lot of extra work, and the disorder of the data can lead to a fragmentation of the data distribution, which can lead to the following problems:

The written target page is likely to have been flushed to disk and removed from the cache, or has not yet been loaded into the cache, innodb has to find and read the target page from disk into memory before inserting, which results in a large number of random IOs
Because writes are out of order, innodb has to do frequent page splitting operations to allocate space for new rows, which causes large amounts of data to be moved and at least three pages to be modified at a time
Because of frequent page fragmentation, pages become sparse and irregularly populated, eventually causing fragmentation of the data

After loading random values (uuid and snowflake id) into the cluster index (the default index type for innodb), there is sometimes a need for OPTIMEIZE TABLE to rebuild the table and optimize the filling of the page, which in turn will take some time.

Conclusion: Use innodb to insert as much as possible in the self-reinforcing order of the primary key, and whenever possible to insert new rows using the value of the monotony of the increased cluster key

2.3. Use the disadvantages of self-increasing id

So is there no harm in using a self-growing id? No, self-increasing id also has the following problems:

Once someone climbs into your database, they can get information about your business growth based on the self-growing id of the database, making it easy to analyze your operations
For high concurrent loads, innodb creates significant lock contention when inserting at the primary key, and the upper bound of the primary key becomes a hot spot for contention because all inserts occur here, and concurrent insertion can cause gap lock competition
Auto_Increment lock mechanism will cause self-reinforcing lock snatching, there is a certain performance loss

Attached: Auto_increment Lock Contention IssueS, If You Want To Improve The Configuration That RequireS tuning Innodb_autoinc_lock_mode

Third, summary

This article begins with the opening question, builds a table, and uses jdbcTemplate to test the data insertion performance of the large amount of data using jdbcTemplate to test different ids, and then analyzes the different mechanisms of id in mysql's index structure and advantages and disadvantages, explaining in depth why uuid and randomly do not repeat id's performance loss in data insertion, explaining this problem in detail.

In the actual development or according to mysql's official recommendation is best to use self-increasing id, mysql is broad and profound, there are many internal points worth optimizing for us to learn.

The above is W3Cschool编程狮 on mySQL's main key why can not use uuid related to the introduction, I hope to help you.