2021-12-07
|
07:37 |
<oblivian@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn'. |
[production] |
07:34 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1162 (T277354)', diff saved to https://phabricator.wikimedia.org/P18043 and previous config saved to /var/cache/conftool/dbconfig/20211207-073413-marostegui.json |
[production] |
07:33 |
<oblivian@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn'. |
[production] |
07:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1162 (T277354)', diff saved to https://phabricator.wikimedia.org/P18042 and previous config saved to /var/cache/conftool/dbconfig/20211207-073252-marostegui.json |
[production] |
07:32 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
07:32 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1162.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
07:23 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: Maintenance T277354 |
[production] |
07:23 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on 8 hosts with reason: Maintenance T277354 |
[production] |
07:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T277354)', diff saved to https://phabricator.wikimedia.org/P18041 and previous config saved to /var/cache/conftool/dbconfig/20211207-072311-marostegui.json |
[production] |
07:16 |
<marostegui> |
power off db2074, db2078, db2101, db2130, dbproxy2004 T296930 |
[production] |
07:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18040 and previous config saved to /var/cache/conftool/dbconfig/20211207-070806-marostegui.json |
[production] |
06:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18039 and previous config saved to /var/cache/conftool/dbconfig/20211207-065301-marostegui.json |
[production] |
06:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T277354)', diff saved to https://phabricator.wikimedia.org/P18038 and previous config saved to /var/cache/conftool/dbconfig/20211207-063756-marostegui.json |
[production] |
06:36 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1105:3312 (T277354)', diff saved to https://phabricator.wikimedia.org/P18037 and previous config saved to /var/cache/conftool/dbconfig/20211207-063621-marostegui.json |
[production] |
06:36 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1105.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
06:36 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1105.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
06:35 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
06:35 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
06:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1100 (T277354)', diff saved to https://phabricator.wikimedia.org/P18036 and previous config saved to /var/cache/conftool/dbconfig/20211207-063140-marostegui.json |
[production] |
06:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1100', diff saved to https://phabricator.wikimedia.org/P18035 and previous config saved to /var/cache/conftool/dbconfig/20211207-061635-marostegui.json |
[production] |
06:14 |
<marostegui> |
Apply SET GLOBAL innodb_checksum_algorithm=full_crc32; on db1107 T287244 |
[production] |
06:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1100', diff saved to https://phabricator.wikimedia.org/P18034 and previous config saved to /var/cache/conftool/dbconfig/20211207-060130-marostegui.json |
[production] |
05:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db2074 and db2130 T296930', diff saved to https://phabricator.wikimedia.org/P18033 and previous config saved to /var/cache/conftool/dbconfig/20211207-055808-marostegui.json |
[production] |
05:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'After maintenance db1100 (T277354)', diff saved to https://phabricator.wikimedia.org/P18032 and previous config saved to /var/cache/conftool/dbconfig/20211207-054625-marostegui.json |
[production] |
05:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1100 (T277354)', diff saved to https://phabricator.wikimedia.org/P18031 and previous config saved to /var/cache/conftool/dbconfig/20211207-054506-marostegui.json |
[production] |
05:45 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1100.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
05:45 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on db1100.eqiad.wmnet with reason: Maintenance T277354 |
[production] |
00:10 |
<cwhite> |
end codfw opensearch upgrade T288621 |
[production] |
2021-12-06
|
22:19 |
<mstyles@deploy1002> |
Synchronized php-1.38.0-wmf.9/includes/content/ContentModelChange.php: Deploy security patch for T271037 (duration: 00m 56s) |
[production] |
20:14 |
<cwhite> |
begin codfw opensearch upgrade T288621 |
[production] |
20:14 |
<cwhite> |
begin codfw opensearch upgrade T288612 |
[production] |
19:58 |
<legoktm> |
trying new dump of Special:CodeReview on mwmaint1002 (T205361) |
[production] |
19:26 |
<legoktm> |
installing php-yaml on all appservers |
[production] |
19:08 |
<damilare> |
updated civicrm from b82183b9 to 311382de |
[production] |
19:04 |
<taavi@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:742835|bnwikibooks: add autopatrolled and patroller user groups (T296640)]] (duration: 00m 56s) |
[production] |
19:03 |
<cmooney@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
19:02 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
19:02 |
<cmooney@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
19:00 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
18:52 |
<cmooney@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
18:45 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
18:43 |
<cmooney@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
18:34 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS buster |
[production] |
18:00 |
<majavah> |
"foreachwiki namespaceDupes.php --fix | tee namespaceDupes-T293839-fix.txt" FINISHED about 15 minutes ago T293839 |
[production] |
17:27 |
<ebernhardson@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: T296897 Move cirrus traffic to codfw (duration: 00m 56s) |
[production] |
16:24 |
<majavah> |
starting "foreachwiki namespaceDupes.php --fix | tee namespaceDupes-T293839-fix.txt" in mwmaint1002 screen session, T293839 |
[production] |
15:55 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ganeti2012.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage |
[production] |
15:55 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on ganeti2012.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage |
[production] |
14:45 |
<elukey> |
roll restart of nfacctd on netflow* nodes to pick up the new CA bundle for librdkafka |
[production] |
14:19 |
<moritzm> |
draining primary/secondary instances off ganeti2012 T296622 |
[production] |