2020-06-25
|
08:14 |
<jynus> |
restarting bacula-dir on backup1001 |
[production] |
08:09 |
<akosiaris> |
restart etherpad-lite on etherpad1002 |
[production] |
08:03 |
<marostegui> |
Failover m1 from db1135 to db1097 - T254556 |
[production] |
07:52 |
<jynus> |
stop bacula-director on backup1001 for db maintenance T254556 |
[production] |
07:49 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:49 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:49 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:49 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:49 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:48 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:48 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:47 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:36 |
<elukey> |
reboot an-launcher1001 for kernel upgrades |
[production] |
07:18 |
<elukey> |
reboot kafkamon* vms for kernel upgrades |
[production] |
07:08 |
<marostegui> |
Start pre switchover steps on m1 T254556 |
[production] |
06:40 |
<elukey> |
reboot matomo1002 for kernel upgrades |
[production] |
06:35 |
<elukey> |
reboot archiva1002 (new vm, not yet in service) for kernel upgrades |
[production] |
06:34 |
<elukey> |
reboot archiva for kernel upgrades |
[production] |
06:31 |
<elukey> |
force puppet run on ores1003/1005 to restore celery (killed by the OOM killer) |
[production] |
06:24 |
<elukey> |
reboot an-tool* vms for kernel upgrades |
[production] |
06:23 |
<elukey> |
reboot analytics-tool1004 for kernel upgrades (Superset host) |
[production] |
06:22 |
<elukey> |
reboot analytics-tool1001 for kernel upgrades |
[production] |
06:19 |
<elukey> |
execute ip addr flush ens5 on an-airflow1001 to clear RTNETLINK answers: File exists (error from ifup@ens5.service) |
[production] |
06:03 |
<elukey> |
reboot an-airflow1001 for kernel upgrades |
[production] |
04:26 |
<marostegui> |
Remove triggers from db2095:3312 - T238966 |
[production] |
04:25 |
<marostegui> |
Deploy schema change on s2 codfw - T238966 |
[production] |
00:48 |
<twentyafterfour> |
restart php-fpm on phab1001 to fix T256343 |
[production] |
00:12 |
<twentyafterfour> |
phabricator updated, all seems normal |
[production] |
00:11 |
<twentyafterfour> |
updating phabricator to release/2020-06-25/1, momentary (<1 minute) downtime expected. |
[production] |
2020-06-24
|
23:44 |
<mutante> |
releases2002 - systemctl stop jenkins, kill 15244 (rogue jenkins process), start jenkins with systemctl start jenkins (T247652) |
[production] |
23:43 |
<mutante> |
releases1002 - kill rogue jenkins process, start jenkins with systemctl start jenkins (T247652) |
[production] |
23:02 |
<mutante> |
releases1002/2002 - disabling puppet, removing failing cron job to pull deployment_charts (because /srv/deployment-charts does not exist yet) |
[production] |
21:45 |
<shdubsh> |
install mtail 3.0.0~rc35+wmf2 on logstash1007 - T255776 |
[production] |
20:42 |
<brennen@deploy1001> |
Synchronized php: group1 wikis to 1.35.0-wmf.38 (duration: 01m 06s) |
[production] |
20:41 |
<brennen@deploy1001> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.38 |
[production] |
20:41 |
<brennen> |
train 1.35.0-wmf.38: attempting to roll forward to group1 after php-fpm restart on mw1287 (T256305, T254175) |
[production] |
20:32 |
<cdanis> |
restarting php-fpm on mw1287 T256305 |
[production] |
20:32 |
<bsitzmann@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
20:30 |
<bsitzmann@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
20:28 |
<bsitzmann@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
20:14 |
<halfak@deploy1001> |
Finished deploy [ores/deploy@1b87365]: T254505 (duration: 14m 08s) |
[production] |
20:09 |
<bsitzmann@deploy1001> |
Finished deploy [mobileapps/deploy@80c763d]: Update mobileapps to a413db4f (duration: 03m 37s) |
[production] |
20:06 |
<bsitzmann@deploy1001> |
Started deploy [mobileapps/deploy@80c763d]: Update mobileapps to a413db4f |
[production] |
20:00 |
<halfak@deploy1001> |
Started deploy [ores/deploy@1b87365]: T254505 |
[production] |
19:38 |
<otto@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Revert Migrate SearchSatisfaction from EventLogging to EventGate on group1 - T249261 (duration: 01m 06s) |
[production] |
19:17 |
<brennen@deploy1001> |
rebuilt and synchronized wikiversions files: Revert group1 wikis to 1.35.0-wmf.37 |
[production] |
19:11 |
<brennen@deploy1001> |
Synchronized php: group1 wikis to 1.35.0-wmf.38 (duration: 01m 04s) |
[production] |
19:10 |
<brennen@deploy1001> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.38 |
[production] |
19:01 |
<brennen> |
train 1.35.0-wmf.38: finished triage meeting, clear to proceed to group 1 (T254175) |
[production] |
18:53 |
<joal@deploy1001> |
Finished deploy [analytics/refinery@1112749] (thin): Regular analytics weekly train THIN [analytics/refinery@1112749] (duration: 00m 09s) |
[production] |