2019-10-01
§
|
16:10 |
<@> |
helmfile [EQIAD] Ran 'sync' command on namespace 'restrouter' for release 'production'. |
[production] |
16:06 |
<@> |
helmfile [STAGING] Ran 'sync' command on namespace 'restrouter' for release 'staging'. |
[production] |
15:36 |
<_joe_> |
temporarily uninstalling the math-rendering-related packages from mwdebug2002, as a test for T195847 |
[production] |
15:36 |
<elukey> |
powercycle an-conf1001 to test some bios settings |
[production] |
15:12 |
<jbond42> |
puppetmaster2001 is back online |
[production] |
14:33 |
<dcausse> |
created cirrussearch indices for nqowiki (T234326) |
[production] |
14:18 |
<moritzm> |
rebooting krb1001 for some tests |
[production] |
14:17 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:17 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:10 |
<hashar> |
Restarting CI Jenkins |
[production] |
14:08 |
<cdanis> |
✔️ cdanis@puppetmaster2001.codfw.wmnet ~ 🕙☕ (cd /var/lib/git/labs/private ; git rev-parse HEAD | sudo tee /srv/config-master/labsprivate-sha1.txt ) |
[production] |
14:08 |
<cdanis> |
✔️ cdanis@puppetmaster2001.codfw.wmnet ~ 🕙☕ (cd /var/lib/git/operations/puppet ; git rev-parse HEAD | sudo tee /srv/config-master/puppet-sha1.txt ) |
[production] |
14:08 |
<herron> |
beginning rolling reboots of eqiad and codfw logstash collectors |
[production] |
14:02 |
<moritzm> |
rebooting mw1265 for some tests |
[production] |
14:01 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:01 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:59 |
<cdanis> |
✔️ cdanis@puppetmaster2001.codfw.wmnet ~ 🕙☕ sudo touch /srv/config-master/puppet-sha1.txt /srv/config-master/labsprivate-sha1.txt && sudo chown gitpuppet:gitpuppet /srv/config-master/puppet-sha1.txt /srv/config-master/labsprivate-sha1.txt |
[production] |
13:42 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
13:40 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:24 |
<jbond42> |
reimage puppetmaster2001 |
[production] |
12:37 |
<hashar> |
Gerrit misbehaved temporarily due to human operator error (hashar ran jstack -l -m, which brings the JVM to a halt) |
[production] |
11:16 |
<jbond42> |
update puppet.ulsfo.wmnet to point to puppetmaster1001 |
[production] |
10:45 |
<jbond42> |
update puppet.eqsin.wmnet to point to puppetmaster1001 |
[production] |
10:17 |
<moritzm> |
upgrading ferm on remaining mw servers to 2.4.2pre T153468 |
[production] |
09:35 |
<moritzm> |
run systemctl reset-failed on puppetmaster2002 to clear failed puppet-master.service |
[production] |
09:19 |
<moritzm> |
upgrading ferm on a number of systems to 2.4.2pre T153468 |
[production] |
09:07 |
<vgutierrez> |
restarting acme-chief on acmechief1001 to catch up with python3-cryptography upgrades - T234131 |
[production] |
09:04 |
<vgutierrez> |
upgrading python3-cryptography to version 2.6.1-3+deb10u1~wmf1 on acme-chief hosts - T234131 |
[production] |
09:03 |
<moritzm> |
rebalancing ganeti/row_B after rolling reboot |
[production] |
08:57 |
<vgutierrez> |
upgrading python3-cryptography to version 2.6.1-3+deb10u1~wmf1 on acmechief-test1001 - T234131 |
[production] |
08:41 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
08:41 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
08:00 |
<moritzm> |
draining ganeti2003 for upcoming reboot (combined kernel/qemu security updates) |
[production] |
07:00 |
<hashar> |
gerrit: forcing reindex of changes # T233989 |
[production] |
06:29 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) |
[production] |
06:29 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
06:28 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
06:28 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
06:19 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2091:3314 schema change - T233625', diff saved to https://phabricator.wikimedia.org/P9223 and previous config saved to /var/cache/conftool/dbconfig/20191001-061956-marostegui.json |
[production] |
05:12 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
05:12 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
00:12 |
<mutante> |
phabricator - upgrading PHP version to 7.2.22 - T230024 |
[production] |
2019-09-30
§
|
23:28 |
<niharika29@deploy1001> |
Synchronized php-1.34.0-wmf.24/extensions/CentralNotice/resources/infrastructure/: CentralNotice: Replace deprecated editToken with csrfToken - T233538 (duration: 00m 57s) |
[production] |
23:23 |
<AndyRussG> |
updated fruec from c591bd653b to 18d89675d0 |
[production] |
21:48 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1290.eqiad.wmnet |
[production] |
21:47 |
<mutante> |
mw1290 - scap pull to get it in sync with latest deployment - it was down during scap run for T234153 |
[production] |
21:42 |
<jforrester@deploy1001> |
Synchronized robots.txt: Remove old InternetArchive bot rule that's been disabled since 2008 T7582 (duration: 00m 57s) |
[production] |
21:40 |
<jforrester@deploy1001> |
Synchronized wmf-config/CommonSettings.php: T222539 Drop no-op hacky disablement of MessageBlobStore::clear() (duration: 05m 13s) |
[production] |
21:38 |
<James_F> |
sync failure on mw1290.eqiad.wmnet – Connection timed out |
[production] |
21:26 |
<mutante> |
mw1290 - downtimed for onsite work on mgmt, depooled earlier |
[production] |