ThinkPHP各种写入数据方式批量插入数据性能对比

飞行猿
2018-11-29 21:30:57
框架
thinkphp

昨天看了一篇关于MySQL在web环境下批量写入数据的实测和优化的博客，作者用的是CI框架，并且分别测试了Myisam和Innodb两种存储引擎，今天想着自己实测一下

但是由于平时用的框架不同，并且由于电脑配置问题，在Innodb测试时耗费过大，所以今天专门测试了一下TP框架下批量写入数据的情况，我们以写入20万数据作为测试数据

数据表设计如下：

环境：

框架：ThinkPHP5.1
PHP：7.1
MySQL：5.5

说明：由于个人电脑配置不同，测试数据的绝对值不作为参考，不过其中的相对关系可以参考一下，并且该测试只以Myisam作为存储引擎，并且测试同时不进行其他数据库操作

一、模型操作

模型文件

<?php
namespace app\index\model;

use think\Model;

class Myisam extends Model
{

}

其中不包含修改器等其他操作

在使用模型的情况下，写入数据分别有三种方式，save()、saveAll()和create()

1、save()方法，逐条写入

<?php
namespace app\index\controller;

use app\index\model\Myisam;

class Index
{
    public function test1()
    {
        //防止超300s 500错误
        set_time_limit(0);
        //设置PHP内存
        ini_set('memory_limit','512M');
        
        $t1 = microtime(true);

        $test = new Myisam();

        //插入20万条数据
        for ($i=1; $i<=200000; $i++){
            $test->isUpdate(false)->data(['value' => uniqid().$i])->save();
        }
        //程序运行时间
        $t2 = microtime(true);
        
        echo '插入耗时：'.round($t2-$t1,3).'秒<br>';

        echo '内存消耗：'.round(memory_get_usage()/1048576,2)." M<br/>";
    }
}

结果：

上面的执行插入的代码里面还涉及到几个问题，

$test->isUpdate(false)->data(['value' => uniqid().$i])->save();

如果单独执行$test->save(['value' => uniqid().$i]);，会发现最终只插入了一条数据，因为save()方法在多次调用时，第一次以后默认执行的都是更新操作，所以需要isUpdate(false)方法来使它每次都执行写入操作

此时如果执行$test->isUpdate(false)->save(['value' => uniqid().$i]);会报一个数据库插入重复主键的错，所以需要使用data()方法来传入数据

所以，save()方法实际上是不能单独多次使用的

2、saveAll()方法，批量写入

清空数据表，执行saveAll()方法

<?php
namespace app\index\controller;

use app\index\model\Myisam;

class Index
{
    public function test2()
    {
        //防止超300s 500错误
        set_time_limit(0);
        //设置PHP内存
        ini_set('memory_limit','512M');
        
        $t1 = microtime(true);

        $test = new Myisam();

        //插入20万条数据
        for ($i=1; $i<=200000; $i++){
            $data[$i] = ['value' => uniqid().$i];
        }
        //程序运行时间
        $t2 = microtime(true);
        
        echo '循环耗时：'.round($t2-$t1,3).'秒<br>';
        
        $test->saveAll($data,false);
        
        $t3 = microtime(true);
        
        echo '插入耗时：'.round($t3-$t2,3).'秒<br>';

        echo '内存消耗：'.round(memory_get_usage()/1048576,2)." M<br/>";
    }
}

结果：

可以看到，saveAll()方法内存消耗是save()方法的三倍左右，但是写入速度好像没有什么提升

因为saveAll()方法在实际执行时执行的还是单条逐条插入

3、create()方法，逐条写入

<?php
namespace app\index\controller;

use app\index\model\Myisam;

class Index
{
    public function index()
    {
        set_time_limit(0);   //防止超300s 500错误

        ini_set('memory_limit','1024M');

        $t1 = microtime(true);

        //插入20万条数据
        for ($i=1; $i<=200000; $i++){
            Myisam::create(['value' => uniqid().$i]);
        }
        //程序运行时间

        $t2 = microtime(true);
        
        echo '插入耗时：'.round($t2-$t1,3).'秒<br>';

        echo '内存消耗：'.round(memory_get_usage()/1048576,2)." M<br/>";
    }
}

结果：

看起来没有什么优势，就是使用简单一点，这种方法其实也是官方比较推荐的纯插入的的方式

二、Db类操作

Db类执行插入有insert()和insertAll()方法

1、insert()方法，逐条写入

<?php
namespace app\index\controller;

use think\Db;

class Index
{
    public function test3()
    {
        set_time_limit(0);   //防止超300s 500错误

        ini_set('memory_limit','1024M');
        
        $t1 = microtime(true);

        //插入20万条数据
        for ($i=1; $i<=200000; $i++){
            Db::name('myisam')->insert(['value' => uniqid().$i]);
        }
        //程序运行时间
        $t2 = microtime(true);
        
        echo '插入耗时：'.round($t2-$t1,3).'秒<br>';

        echo '内存消耗：'.round(memory_get_usage()/1048576,2)." M<br/>";
    }
}

结果：

2、insertAll()方法，批量写入

<?php
namespace app\index\controller;

use think\Db;

class Index
{
    public function test4()
    {
        set_time_limit(0);   //防止超300s 500错误

        ini_set('memory_limit','1024M');

        $t1 = microtime(true);

        //插入20万条数据
        for ($i=1; $i<=200000; $i++){
            $data[$i] = ['value' => uniqid().$i];
        }
        //程序运行时间
        $t2 = microtime(true);
        
        echo '循环耗时：'.round($t2-$t1,3).'秒<br>';
        
        Db::name('myisam')->insertAll($data);

        $t3 = microtime(true);
        
        echo '插入耗时：'.round($t3-$t2,3).'秒<br>';

        echo '内存消耗：'.round(memory_get_usage()/1048576,2)." M<br/>";
    }
}

结果：

数据循环结束后直接报错了，对于这个错误，在下面的文章中详细讨论：

MySQL server has gone away报错原因分析 - 飞行猿博客

尝试一下减少数据量，把200000改成100000试一下，结果又报错了

SQLSTATE[HY000]: General error: 1390 Prepared statement contains too many placeholders

这个问题在下面一篇中讨论：

MySQL 1390 Prepared statement contains too many placeholders 问题分析 - 飞行猿博客

再尝试一下减少数据量，把200000改成20000试一下

结果：

可见，插入速度和数据量不是等比的，所以框架提供了limit()方法进行分批插入

我们来试一下

<?php
namespace app\index\controller;

use think\Db;

class Index
{
    public function test3()
    {
        set_time_limit(0);   //防止超300s 500错误

        ini_set('memory_limit','1024M');

        $t1 = microtime(true);

        //插入20万条数据
        for ($i=1; $i<=200000; $i++){
            $data[$i] = ['value' => uniqid().$i];
        }
        //程序运行时间
        $t2 = microtime(true);
        
        echo '循环耗时：'.round($t2-$t1,3).'秒<br>';
        
        Db::name('myisam')->limit(20000)->insertAll($data);

        $t3 = microtime(true);
        
        echo '插入耗时：'.round($t3-$t2,3).'秒<br>';

        echo '内存消耗：'.round(memory_get_usage()/1048576,2)." M<br/>";
    }
}

结果：