
Vysokomolekulyarnye Soedineniya, Seriya B, 2011, Vol. 53, No. 9, pp. 1665-1671

THEORY

UDC 541.64:542.952

A SUPPORT VECTOR MACHINE MODEL FOR THE PREDICTION OF MONOMER REACTIVITY RATIOS¹

© 2011 Xinliang Yuᵃ,ᵇ and Xueye Wangᵇ

ᵃ College of Chemistry and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, Hunan 411104, China

ᵇ Key Laboratory of Environmentally Friendly Chemistry and Applications of Ministry of Education, College of Chemistry, Xiangtan University, Xiangtan, Hunan 411105, China

e-mail: yxliang5602@sina.com.cn

Received January 28, 2011

Revised Manuscript Received February 27, 2011

Abstract. To predict monomer reactivity ratios in the radical copolymerization of monomers M1 (C1H2=C2XY) with M2 (styrene), a support vector machine (SVM) model was developed. Sixteen quantum chemical descriptors were calculated by density functional theory at the B3LYP level of theory with the 6-31G(d) basis set, and the genetic algorithm method, together with multiple linear regression analysis, was then used to select the best combination of variables. The optimal SVM model with four descriptors ($q_{AC1}$, $Q_{AC2}$, $\mu$ and $E_{LUMO}$) was obtained with the Gaussian radial basis function kernel ($C = 8000$, $\varepsilon = 0.001$ and $\gamma = 0.01$). The root-mean-square errors for the training, validation and test sets are 0.125, 0.123 and 0.188, respectively, indicating that the model is more accurate than the existing artificial neural network model. It is therefore reasonable to predict monomer reactivity ratios with the support vector machine method.

INTRODUCTION

Monomer reactivity ratios not only describe the relative reactivities of monomers but also provide valuable and precise information for determining microstructural parameters such as the distribution of units and sequence lengths along the macromolecular chains [1]. The copolymer composition equation, which relates the composition of the initially formed copolymer to that of the initial monomer mixture, is given by [2]

$$R_p = R_m \, \frac{r_{12} R_m + 1}{r_{21} + R_m}, \tag{1}$$

where $R_m$ is equal to [M1]/[M2] in the monomer mixture and $R_p$ is equal to [M1]/[M2] in the polymer formed. It would be extremely useful to obtain the values of $r_{12}$ and $r_{21}$, and hence the composition of any copolymer produced from any pair of monomers at any concentration ratio [2].
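As a minimal sketch, Eq. (1) can be evaluated directly once the two reactivity ratios are known. The function names and the numerical values below are illustrative assumptions and are not taken from the paper.

```python
def copolymer_ratio(r_m, r12, r21):
    """Evaluate Eq. (1): R_p = R_m * (r12 * R_m + 1) / (r21 + R_m).

    r_m is [M1]/[M2] in the monomer mixture; the return value is
    [M1]/[M2] in the copolymer formed from that mixture.
    """
    if r_m < 0 or r12 < 0 or r21 < 0:
        raise ValueError("ratios must be non-negative")
    denominator = r21 + r_m
    if denominator == 0:
        raise ValueError("r21 + R_m must be positive")
    return r_m * (r12 * r_m + 1.0) / denominator


def mole_fraction_m1(ratio):
    """Convert a ratio [M1]/[M2] into the mole fraction of M1."""
    return ratio / (1.0 + ratio)


# Hypothetical reactivity ratios, chosen only to exercise the formula.
r_p = copolymer_ratio(r_m=1.0, r12=0.5, r21=0.5)
print(r_p, mole_fraction_m1(r_p))
```

When both reactivity ratios equal 1 the expression reduces to $R_p = R_m$, so the copolymer has the same composition as the monomer mixture.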

Generally, the reactivity ratios are obtained experimentally. They can also be determined using semi-empirical methods such as the Q-e scheme [3, 4] and the revised patterns scheme [5, 6]. However, the semi-empirical methods cannot be applied when the parameter values (Q, e, u and v) are not known. Yu et al. [7] developed artificial neural network models to predict monomer reactivity ratios ($\log r_{12}$) in the radical copolymerization of monomers M1 (styrene, methyl methacrylate and acrylonitrile) with M2 (vinyl monomers). However, these models were not tested on prediction sets.
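For orientation, the Q-e scheme mentioned above can be sketched in its standard Alfrey-Price form, $r_{12} = (Q_1/Q_2)\exp[-e_1(e_1 - e_2)]$ and $r_{21} = (Q_2/Q_1)\exp[-e_2(e_2 - e_1)]$. The function names are assumptions, the M1 values in the usage lines are hypothetical, and styrene is given its conventional reference values Q = 1.0 and e = -0.8.

```python
import math

# Conventional Q-e reference values for styrene (Q = 1.0, e = -0.8).
STYRENE_Q, STYRENE_E = 1.0, -0.8


def q_e_reactivity_ratios(q1, e1, q2=STYRENE_Q, e2=STYRENE_E):
    """Return (r12, r21) from the Alfrey-Price Q-e equations."""
    r12 = (q1 / q2) * math.exp(-e1 * (e1 - e2))
    r21 = (q2 / q1) * math.exp(-e2 * (e2 - e1))
    return r12, r21


def log10_r12(q1, e1, q2=STYRENE_Q, e2=STYRENE_E):
    """Logarithmic form: linear in log Q1, log Q2 and e1 * (e1 - e2)."""
    return math.log10(q1) - math.log10(q2) - e1 * (e1 - e2) / math.log(10.0)


# Hypothetical Q and e values for M1, used only for illustration.
r12, r21 = q_e_reactivity_ratios(q1=0.5, e1=0.5)
print(r12, r21, log10_r12(0.5, 0.5))
```

The sketch also makes the limitation explicit: without Q and e for a given monomer, neither ratio can be evaluated.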

¹ The article is published in the form submitted by the authors.

In recent years, the support vector machine (SVM) has become one of the most promising learning algorithms for classification and regression because of its many attractive features and successful applications. The goal of this paper is to produce a robust SVM model that can predict the monomer reactivity ratios $\log r_{1S}$ in the radical copolymerization of monomers M1 (C1H2=C2XY) with M2 (styrene).
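Two ingredients of SVM regression can be sketched briefly: the Gaussian radial basis function kernel and the $\varepsilon$-insensitive loss. This is only an illustration, not the authors' implementation. It assumes that the kernel settings quoted in the abstract are the usual regression parameters $\varepsilon = 0.001$ and $\gamma = 0.01$, and that the kernel follows the common convention $K(x, z) = \exp(-\gamma \lVert x - z \rVert^2)$. The descriptor vectors are the table rows for vinyl acetate and styrene.

```python
import math

GAMMA = 0.01     # kernel width parameter quoted in the abstract (assumed to be gamma)
EPSILON = 0.001  # tube half-width quoted in the abstract (assumed to be epsilon)


def rbf_kernel(x, z, gamma=GAMMA):
    """Gaussian RBF kernel, assuming K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def eps_insensitive_loss(y_true, y_pred, epsilon=EPSILON):
    """Errors smaller than epsilon cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Descriptor vectors (q_AC1, Q_AC2, mu, E_LUMO) copied from the table.
vinyl_acetate = (-0.209225, 0.537494, 1.6807, -0.01035)
styrene = (-0.146739, 0.116960, 0.1907, -0.03054)

print(rbf_kernel(vinyl_acetate, styrene))
print(eps_insensitive_loss(-1.6990, -1.3023))
```

With a small $\gamma$ and unscaled descriptors the kernel value stays close to 1 for most monomer pairs. Whether the authors scaled the descriptors before training is not stated in this excerpt.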

MATERIALS AND METHODS

The table shows 60 monomer reactivity ratios for the radical copolymerization of vinyl monomers M1 (C1H2=C2XY) with M2 (styrene) [8]. The monomers M1 show a high degree of structural variety. For example, the functional groups present in the side chains include acids, aldehydes, amides, nitriles, ketones, halides, esters, sulfides, aromatic rings, non-aromatic rings, and so on. The logarithms of the monomer reactivity ratios $r_{1S}$ are used because the data are spread more evenly when $\log r_{1S}$ is used instead of $r_{1S}$. Moreover, the logarithmic form of the monomer reactivity ratios provides a more convenient linear solution for the Q-e scheme [3, 4] and the revised patterns scheme [5, 6]. The data set of monomer reactivity ratios (see table) was randomly split into training, validation and prediction sets of 30, 15 and 15 monomers, respectively.
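The log transform and the 30/15/15 random split described above can be sketched as follows. The function names and the seed are assumptions. The paper does not state how its random split was generated, so this will not reproduce the particular assignment shown in the table.

```python
import math
import random


def log_ratio(r):
    """Base-10 logarithm of a reactivity ratio."""
    return math.log10(r)


def split_dataset(items, n_train=30, n_val=15, n_pred=15, seed=0):
    """Shuffle a copy of items and cut it into three disjoint subsets."""
    if n_train + n_val + n_pred != len(items):
        raise ValueError("subset sizes must add up to the data set size")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is a hypothetical choice
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    prediction = shuffled[n_train + n_val:]
    return train, validation, prediction


monomer_numbers = list(range(1, 61))  # monomers are numbered 1 to 60 in the table
train, validation, prediction = split_dataset(monomer_numbers)
print(len(train), len(validation), len(prediction))
```

As a check on the transform, a ratio of 0.02 gives a logarithm of about -1.699, the experimental value listed for vinyl acetate.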

To fit the reactivity ratios $\log r_{1S}$, 16 descriptors were calculated using density functional theory (DFT) in the Gaussian 03 program [9] at the B3LYP level of theory with the 6-31G(d) basis set. These descriptors included the Mulliken charges of C1, C2 and R3 ($q_{MC1}$, $q_{MC2}$ and $q_{MR3}$), the Mulliken charges of C1, C2 and R3 with hydrogens summed into heavy atoms ($Q_{MC1}$, $Q_{MC2}$ and $Q_{MR3}$), the atomic polar tensor charges of C1, C2 and R3 ($q_{AC1}$, $q_{AC2}$ and $q_{AR3}$), the atomic polar tensor charges of C1, C2 and R3 with hydrogens summed into heavy atoms ($Q_{AC1}$, $Q_{AC2}$ and $Q_{AR3}$), the energies of the highest occupied molecular orbital ($E_{HOMO}$) and the lowest unoccupied molecular orbital ($E_{LUMO}$), the LUMO-HOMO orbital energy difference ($\Delta E_g = E_{LUMO} - E_{HOMO}$), and the total dipole moment ($\mu$).

Descriptors used and monomer reactivity ratios for 60 monomers

| No. | Monomer | $q_{AC1}$ | $Q_{AC2}$ | $\mu$ | $E_{LUMO}$ | $\log r_{1S}$ (Exp.) | $\log r_{1S}$ (Calc.) |
|---|---|---|---|---|---|---|---|
| | Training set | | | | | | |
| 1 | Vinyl acetate | -0.209225 | 0.537494 | 1.6807 | -0.01035 | -1.6990 | -1.3023 |
| 2 | Vinyl bromide | -0.184515 | 0.322882 | 1.4905 | -0.00276 | -1.2680 | -1.0184 |
| 3 | Vinyl chloride | -0.178405 | 0.388537 | 1.6128 | -0.00130 | -1.2600 | -1.2696 |
| 4 | Vinyl chloromethyl ketone | -0.009070 | -0.158362 | 4.4722 | -0.07280 | -0.2950 | -0.3047 |
| 5 | Vinyl dichloroacetate | -0.182978 | 0.533141 | 1.2219 | -0.05583 | -0.5530 | -0.5428 |
| 6 | Vinyl methyl ketone | -0.012812 | -0.158676 | 3.1137 | -0.05691 | -0.4950 | -0.4558 |
| 7 | Vinyl phenyl sulfide | -0.167266 | 0.179144 | 1.3838 | -0.01269 | -0.8540 | -0.5730 |
| 8 | Vinyl stearate | -0.224553 | 0.578319 | 1.5330 | -0.00786 | -1.3010 | -1.3114 |
| 9 | Vinyl tert-butyl sulfide | -0.212891 | 0.224253 | 1.8887 | 0.02011 | -0.8013 | -0.8117 |
| 10 | Vinyl 2-chloroethyl ether | -0.291438 | 0.618845 | 1.0868 | 0.01693 | -1.1550 | -1.1447 |
| 11 | Styrene, p-chloromethyl- | -0.139193 | 0.094446 | 2.4578 | -0.04866 | 0.0492 | 0.0388 |
| 12 | Styrene, p-methyl- | -0.163257 | 0.144259 | 0.6170 | -0.02707 | -0.0031 | -0.0135 |
| 13 | Styrene, p-1-(2-hydroxypropyl)- | -0.168992 | 0.151401 | 1.9882 | -0.02459 | -0.0410 | -0.1935 |
| 14 | Acrylate, α-chloro-, methyl | -0.143005 | 0.224688 | 3.5147 | -0.05076 | -0.5229 | -0.5126 |
| 15 | Acrylate, α-cyano-, methyl | -0.034339 | -0.016851 | 5.5808 | -0.08472 | -0.2147 | -0.2051 |
| 16 | Acrylate, ethyl | -0.001391 | -0.164429 | 1.6646 | -0.04240 | -0.7696 | -0.7592 |
| 17 | Acrylate, methyl | -0.001352 | -0.154921 | 1.4969 | -0.04405 | -0.7447 | -0.7536 |
| 18 | Acrylate, octadecyl | -0.015417 | -0.139835 | 2.3856 | -0.04263 | -0.5850 | -0.5953 |
| 19 | Methacrylate, 2,2,6,6-tetramethyl-4-piperidinyl | -0.059297 | -0.104476 | 1.2008 | -0.03683 | -0.5229 | -0.4406 |
| 20 | Methacrylate, 2-bromoethyl | -0.075200 | -0.067634 | 2.8452 | -0.04647 | -0.3872 | -0.2525 |
| 21 | Methacrylate, benzyl | -0.079629 | -0.065409 | 2.0021 | -0.03776 | -0.3279 | -0.2990 |
| 22 | Methacrylate, butyl | -0.080664 | -0.060589 | 1.8956 | -0.03604 | -0.2757 | -0.3300 |
| 23 | Methacrylate, glycidyl | -0.061043 | -0.100338 | 2.1421 | -0.04089 | -0.3010 | -0.3627 |
| 24 | Methacrylate, isobutyl | -0.078891 | -0.058047 | 1.9092 | -0.03681 | -0.3768 | -0.3355 |
| 25 | Methacrylate, methyl | -0.075020 | -0.057835 | 1.6753 | -0.03818 | -0.3372 | -0.3468 |
| 26 | Naphthalene, 1-vinyl- | -0.114970 | 0.084520 | 0.1210 | -0.04457 | 0.3054 | 0.3153 |
| 27 | Pyridine, 2-methyl-5-vinyl- | -0.145218 | 0.119726 | 1.7369 | -0.03716 | -0.0706 | -0.0603 |
| 28 | Pyridine, 2-vinyl- | -0.086862 | -0.028320 | 1.8107 | -0.04106 | 0.1004 | -0.2346 |
| 29 | Pyridine, 4-vinyl- | -0.099799 | 0.037907 | 2.4607 | -0.05125 | -0.1612 | -0.1716 |
| 30 | Acrolein | 0.019444 | -0.177162 | 3.1609 | -0.06506 | -0.5686 | -0.5785 |
| | Validation set | | | | | | |
| 31 | Methacrylate, 3,5-dimethyladamantyl | -0.067817 | -0.106720 | 1.5753 | -0.03652 | -0.2007 | -0.3665 |
| 32 | Itaconic anhydride | -0.003363 | -0.152358 | 4.8068 | -0.07674 | -0.2596 | -0.2562 |
| 33 | Hexatriene, tetrachloro- | -0.055149 | 0.004375 | 2.4091 | -0.07042 | -0.0706 | -0.2397 |
| 34 | Styrene, p-acetoxy- | -0.160990 | 0.142581 | 1.6986 | -0.03490 | 0.1004 | 0.0144 |
| 35 | Tetrazole, 1-vinyl- | -0.155896 | 0.374473 | 5.4083 | -0.05513 | -0.7352 | -0.8857 |
| 36 | Acrylamide, N-methylol | -0.041787 | -0.091072 | 3.5004 | -0.04772 | -0.1549 | -0.3555 |
| 37 | Pyridine, 2-vinyl-5-ethyl- | -0.127255 | 0.069087 | 2.2222 | -0.03792 | 0.0374 | -0.1629 |
| 38 | Vinyl ethyl sulfide | -0.178523 | 0.177255 | 1.6443 | 0.00808 | -0.7400 | -0.8920 |
| 39 | Vinyl hendecanoate | -0.223756 | 0.575421 | 1.5823 | -0.00790 | -1.3010 | -1.3153 |
| 40 | p-Vinylbenzylmethylcarbinol | -0.158462 | 0.134567 | 1.4136 | -0.03498 | -0.0270 | 0.0706 |
| 41 | Styrene | -0.146739 | 0.116960 | 0.1907 | -0.03054 | 0.0000 | 0.0609 |
| 42 | Acrylate, α-phenyl-, methyl | -0.155336 | 0.028886 | 4.0697 | -0.04292 | 0.1072 | 0.0884 |
| 43 | Acrylate, benzyl | -0.013452 | -0.142093 | 2.4432 | -0.04471 | -0.6990 | -0.5839 |
| 44 | Acrylate, butyl | -0.001012 | -0.170791 | 1.7185 | -0.04178 | -0.7447 | -0.7518 |
| 45 | Methacrylic acid | -0.059861 | -0.060163 | 1.5819 | -0.04337 | -0.2807 | -0.3751 |
| | Prediction set | | | | | | |
| 46 | Methacrylonitrile | -0.091307 | 0.144638 | 3.8840 | -0.04510 | -0.4815 | -0.5573 |
| 47 | Methacrylamide, N-phenyl- | -0.085817 | -0.075495 | 3.4554 | -0.03884 | -0.0555 | -0.1812 |
| 48 | Methacrylate, phenyl | -0.042366 | -0.113254 | 3.8061 | -0.04936 | -0.2924 | -0.2816 |
| 49 | Methacrylate, 2-hydroxyethyl | -0.078067 | -0.057835 | 3.2751 | -0.03920 | -0.1938 | -0.2696 |
| 50 | Acrylamide | -0.016236 | -0.165551 | 3.5335 | -0.03079 | -0.1549 | -0.2906 |
| 51 | Isopropenyl isocyanate | -0.260985 | 0.668904 | 2.3130 | -0.01190 | -1.0177 | -1.1875 |
| 52 | Isopropenyl methyl ketone | -0.054776 | -0.162201 | 2.8722 | -0.04909 | -0.3188 | -0.2434 |
| 53 | Oxazoline, 2-isopropenyl- | -0.116250 | 0.054263 | 1.1946 | -0.02564 | -0.1938 | -0.4437 |
| 54 | Oxazoline, 2-isopropenyl-4,4-dimethyl- | -0.121448 | 0.057456 | 1.4826 | -0.02327 | -0.1675 | -0.4600 |
| 55 | Silane, 3-methacryloxypropyl, trimethoxy- | -0.107692 | -0.037012 | 3.6349 | -0.02992 | -0.0615 | -0.1170 |
| 56 | p-Vinylbenzoic acid | -0.113873 | 0.048918 | 4.9875 | -0.06677 | 0.0124 | -0.2028 |
| 57 | Styrene, α-methyl | -0.184001 | 0.170400 | 0.3137 | -0.02195 | -0.2219 | 0.0446 |
| 58 | Vinyl chloroacetate | -0.185126 | 0.505409 | 3.4374 | -0.04997 | -1.5230 | -1.2450 |
| 59 | Methacrylate, 2-chloroethyl | -0.073993 | -0.066793 | 2.9844 | -0.04715 | -0.5229 | -0.2539 |
| 60 | Vinylidene chloride | -0.282129 | 0.761752 | 1.5146 | -0.01686 | -0.9686 | -1.1520 |
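The agreement between the experimental and calculated $\log r_{1S}$ values in the table can be summarized by a root-mean-square error. The sketch below uses the usual definition and the 15 prediction-set rows (monomers 46 to 60) copied from the table. The function name is an assumption, and the paper does not state which formula the authors used.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Experimental and calculated log r_1S for monomers 46-60 (prediction set).
exp_pred = [-0.4815, -0.0555, -0.2924, -0.1938, -0.1549,
            -1.0177, -0.3188, -0.1938, -0.1675, -0.0615,
            0.0124, -0.2219, -1.5230, -0.5229, -0.9686]
calc_pred = [-0.5573, -0.1812, -0.2816, -0.2696, -0.2906,
             -1.1875, -0.2434, -0.4437, -0.4600, -0.1170,
             -0.2028, 0.0446, -1.2450, -0.2539, -1.1520]

print(round(rmse(exp_pred, calc_pred), 3))
```

The result rounds to 0.188, the test-set error quoted in the abstract.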

The genetic algorithm (GA) method was used to select an optimum subset of descriptors for SVM models. For the last few years, the GA
