2019-09-17
17:09 | <urandom> | decommissioning Cassandra, restbase2011-b -- T224553 | [production]
17:08 | <elukey@cumin1001> | START - Cookbook sre.hadoop.reboot-workers | [production]
17:00 | <@> | helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' . | [production]
16:59 | <@> | helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' . | [production]
16:21 | <elukey@cumin1001> | END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) | [production]
16:04 | <jbond42> | run octocatalog-diff from elnath with current facts | [production]
15:55 | <reedy@deploy1001> | Synchronized wmf-config/CommonSettings.php: Revert Set MinimumPasswordLengthToLogin to 10 for all prived groups, not just +staff (duration: 00m 55s) | [production]
15:53 | <reedy@deploy1001> | sync-file aborted: (no justification provided) (duration: 00m 01s) | [production]
15:53 | <reedy@deploy1001> | sync-file aborted: (no justification provided) (duration: 00m 00s) | [production]
15:39 | <elukey@cumin1001> | START - Cookbook sre.hadoop.reboot-workers | [production]
15:38 | <urandom> | decommissioning Cassandra, restbase2011-a -- T224553 | [production]
15:17 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Host down for on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9120 and previous config saved to /var/cache/conftool/dbconfig/20190917-151714-marostegui.json | [production]
15:16 | <marostegui> | Stop MySQL on db2127 and shut the host down for onsite maintenance | [production]
14:52 | <elukey@cumin1001> | END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) | [production]
14:52 | <elukey@cumin1001> | START - Cookbook sre.hadoop.reboot-workers | [production]
14:51 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on wikitech for T232464 | [production]
14:51 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on section 8 wikis for T232464 | [production]
14:51 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on section 7 wikis for T232464 | [production]
14:50 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on section 6 wikis for T232464 | [production]
14:50 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on section 5 wikis for T232464 | [production]
14:50 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on section 4 wikis for T232464 | [production]
14:50 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on remaining section 3 wikis for T232464 | [production]
14:50 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on section 2 wikis for T232464 | [production]
14:50 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on section 1 wikis for T232464 | [production]
14:48 | <anomie@mwmaint1002> | Running cleanupRevActorPage.php on test wikis and mediawikiwiki for T232464 | [production]
14:39 | <anomie@deploy1001> | Synchronized php-1.34.0-wmf.22/includes/MergeHistory.php: Backport MergeHistory fix for T232464 [[gerrit:537436]] (duration: 00m 54s) | [production]
14:35 | <ottomata> | bouncing eventstreams service on scb hosts | [production]
14:15 | <@> | helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' . | [production]
14:14 | <@> | helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' . | [production]
14:13 | <@> | helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' . | [production]
14:03 | <herron> | migrating kafka1003 to kafka-main1003 T225005 | [production]
14:00 | <jbond42> | forcing puppet run | [production]
14:00 | <bblack> | lvs1015 - restart pybal to remove runcommands - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/536581/ | [production]
13:59 | <bblack> | lvs2003 - restart pybal to remove runcommands - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/536581/ | [production]
13:57 | <@> | helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' . | [production]
13:52 | <bblack> | lvs1016 - restart pybal to remove runcommands - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/536581/ | [production]
13:52 | <bblack> | lvs2006 - restart pybal to remove runcommands - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/536581/ | [production]
13:46 | <@> | helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' . | [production]
13:45 | <moritzm> | repooling restbase2010 after reimage/completed bootstrap | [production]
13:21 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Repool db1130 db1104 db1085 db1086 after PDU maintenance - T227539', diff saved to https://phabricator.wikimedia.org/P9117 and previous config saved to /var/cache/conftool/dbconfig/20190917-132102-marostegui.json | [production]
13:17 | <godog> | force-run puppet in eqiad to update exported resources | [production]
13:14 | <jbond42> | currently running octocatalog-diff for all hosts from elnath | [production]
13:02 | <marostegui> | Start replication on db1130 db1104 db1085 db1086 after PDU maintenance is completed - T227539 | [production]
13:01 | <cmjohnson1> | The PDU swap in rack B3 eqiad is finished. | [production]
12:30 | <mobrovac> | bootstrap restbase2010-c - T224553 | [production]
11:32 | <Urbanecm> | EU SWAT is done | [production]
11:31 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0) | [production]
11:31 | <dzahn@cumin1001> | Updating IPMI password on 8 hosts - dzahn@cumin1001 | [production]
11:31 | <urbanecm@deploy1001> | Synchronized wmf-config/VariantSettings.php: SWAT: 290e207: Add channels for the Translate and TranslationsNotification extension (T221119, T144780, T143073) (duration: 00m 56s) | [production]
11:30 | <dzahn@cumin1001> | START - Cookbook sre.hosts.ipmi-password-reset | [production]