2021-01-06
|
16:58 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2026.codfw.wmnet with reason: REIMAGE |
[production] |
16:56 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc1026.eqiad.wmnet with reason: REIMAGE |
[production] |
16:42 |
<razzi@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) |
[production] |
16:32 |
<jayme@deploy1001> |
helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
16:31 |
<jayme@deploy1001> |
helmfile [staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
16:15 |
<jayme@deploy1001> |
helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
16:15 |
<jayme@deploy1001> |
helmfile [staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
16:02 |
<razzi@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-masters |
[production] |
16:01 |
<moritzm> |
installing cups security updates on buster (client-side tools/libs) |
[production] |
15:54 |
<moritzm> |
installing openexr security updates |
[production] |
15:28 |
<jayme> |
imported calico 3.17.1-1 to component/calico-future stretch-wikimedia |
[production] |
15:20 |
<moritzm> |
restarting FPM/Apache on mw canaries to pick up p11-kit update |
[production] |
15:04 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
15:02 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
15:01 |
<moritzm> |
installing p11-kit security updates on stretch |
[production] |
14:58 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
14:55 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
13:03 |
<moritzm> |
installing tcpdump security updates |
[production] |
12:11 |
<awight@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:654005|Create rollbacker group on mrwiki (T270864)]] (duration: 01m 21s) |
[production] |
11:33 |
<moritzm> |
installing ruby2.5 security updates |
[production] |
11:18 |
<moritzm> |
installing libjpeg-turbo security updates on buster |
[production] |
11:14 |
<moritzm> |
remove cloudceph2002-dev.wikimedia.org and cloudceph2003-dev.wikimedia.org from debmonitor (got reinstalled as .wmnet) |
[production] |
10:40 |
<jmm@cumin2001> |
dbctl commit (dc=all): 'Depool db2140', diff saved to https://phabricator.wikimedia.org/P13658 and previous config saved to /var/cache/conftool/dbconfig/20210106-104029-jmm.json |
[production] |
10:38 |
<moritzm> |
depooling db2140 T271084 |
[production] |
08:40 |
<moritzm> |
installing Linux 4.9.246 on stretch hosts (no reboots yet) |
[production] |
01:22 |
<mutante> |
testreduce1001 rm -rf /srv/deployment/parsoid/deploy |
[production] |
00:55 |
<eileen> |
process-control config revision is 9bc3d67b02 |
[production] |
00:46 |
<eileen> |
process-control config revision is c38eaa20ed |
[production] |
00:35 |
<eileen> |
civicrm revision changed from 6be8a130df to 1d5f6365ba, config revision is d8756a45c1 |
[production] |
2021-01-05
|
23:13 |
<ladsgroup@deploy1001> |
Synchronized php-1.36.0-wmf.25/includes/logging/LogPager.php: [[gerrit:654507|Check for the index name while it's being renamed]] (duration: 01m 06s) |
[production] |
22:26 |
<reedy@deploy1001> |
Synchronized php-1.36.0-wmf.25/extensions/AbuseFilter/extension.json: T271266 (duration: 01m 04s) |
[production] |
21:48 |
<jhuneidi@deploy1001> |
rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.36.0-wmf.22" |
[production] |
21:12 |
<razzi@cumin1001> |
END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) |
[production] |
21:02 |
<razzi@cumin1001> |
START - Cookbook sre.aqs.roll-restart |
[production] |
20:53 |
<razzi@deploy1001> |
Finished deploy [analytics/aqs/deploy@5d05f83]: Configure http request timeout and caching for T268809 (duration: 04m 48s) |
[production] |
20:50 |
<jhuneidi@deploy1001> |
rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.25 refs T267418 |
[production] |
20:48 |
<razzi@deploy1001> |
Started deploy [analytics/aqs/deploy@5d05f83]: Configure http request timeout and caching for T268809 |
[production] |
20:44 |
<razzi> |
deploy aqs (analytics query service) as part of analytics train |
[production] |
20:38 |
<rzl> |
rzl@mw1362:~$ sudo -i /usr/local/sbin/restart-php7.2-fpm |
[production] |
20:28 |
<mutante> |
repooled mw1362 |
[production] |
20:20 |
<mutante> |
mw1344 - /usr/local/sbin/restart-php7.2-fpm |
[production] |
20:04 |
<mutante> |
mw1344 - restarted apache2 - it was showing the same "partial results" error as mw1362 - no other appservers are showing up in logstash, but these were the #1 and #2 sources of errors |
[production] |
19:47 |
<mutante> |
depooled mw1362 |
[production] |
19:41 |
<mutante> |
mw1362 - restarted apache2 |
[production] |
19:29 |
<razzi@deploy1001> |
Finished deploy [analytics/refinery@56fb3ff] (thin): Regular analytics weekly train THIN [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] (duration: 00m 08s) |
[production] |
19:29 |
<razzi@deploy1001> |
Started deploy [analytics/refinery@56fb3ff] (thin): Regular analytics weekly train THIN [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] |
[production] |
19:28 |
<razzi@deploy1001> |
Finished deploy [analytics/refinery@56fb3ff]: Regular analytics weekly train [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] (duration: 09m 37s) |
[production] |
19:19 |
<razzi@deploy1001> |
Started deploy [analytics/refinery@56fb3ff]: Regular analytics weekly train [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] |
[production] |
19:17 |
<razzi> |
deploying refinery for weekly train |
[production] |
19:16 |
<mutante> |
mwdebug1003 - editing apache2 defaults conf and dropping ServerAdmin address; restarting |
[production] |