2019-10-01
14:17 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
14:10 <hashar> Restarting CI Jenkins [production]
14:08 <cdanis> ✔️ cdanis@puppetmaster2001.codfw.wmnet ~ 🕙☕ (cd /var/lib/git/labs/private ; git rev-parse HEAD | sudo tee /srv/config-master/labsprivate-sha1.txt ) [production]
14:08 <cdanis> ✔️ cdanis@puppetmaster2001.codfw.wmnet ~ 🕙☕ (cd /var/lib/git/operations/puppet ; git rev-parse HEAD | sudo tee /srv/config-master/puppet-sha1.txt ) [production]
14:08 <herron> beginning rolling reboots of eqiad and codfw logstash collectors [production]
14:02 <moritzm> rebooting mw1265 for some tests [production]
14:01 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:01 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
13:59 <cdanis> ✔️ cdanis@puppetmaster2001.codfw.wmnet ~ 🕙☕ sudo touch /srv/config-master/puppet-sha1.txt /srv/config-master/labsprivate-sha1.txt && sudo chown gitpuppet:gitpuppet /srv/config-master/puppet-sha1.txt /srv/config-master/labsprivate-sha1.txt [production]
13:42 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:40 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:24 <jbond42> reimage puppetmaster2001 [production]
12:37 <hashar> Gerrit misbehaved temporarily due to human operator error (hashar ran jstack -l -m, which brings the JVM to a halt) [production]
11:16 <jbond42> update puppet.ulsfo.wmnet to point to puppetmaster1001 [production]
10:45 <jbond42> update puppet.eqsin.wmnet to point to puppetmaster1001 [production]
10:17 <moritzm> upgrading ferm on remaining mw servers 2.4.2pre T153468 [production]
09:35 <moritzm> run systemctl reset-failed on puppetmaster2002 to clear failed puppet-master.service [production]
09:19 <moritzm> upgrading ferm on a number of systems to 2.4.2pre T153468 [production]
09:07 <vgutierrez> restarting acme-chief on acmechief1001 to catch up with python3-cryptography upgrades - T234131 [production]
09:04 <vgutierrez> upgrading python3-cryptography to version 2.6.1-3+deb10u1~wmf1 on acme-chief hosts - T234131 [production]
09:03 <moritzm> rebalancing ganeti/row_B after rolling reboot [production]
08:57 <vgutierrez> upgrading python3-cryptography to version 2.6.1-3+deb10u1~wmf1 on acmechief-test1001 - T234131 [production]
08:41 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
08:41 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
08:00 <moritzm> draining ganeti2003 for upcoming reboot (combined kernel/qemu security updates) [production]
07:00 <hashar> gerrit: forcing reindex of changes # T233989 [production]
06:29 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) [production]
06:29 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
06:28 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
06:28 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
06:19 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2091:3314 schema change - T233625', diff saved to https://phabricator.wikimedia.org/P9223 and previous config saved to /var/cache/conftool/dbconfig/20191001-061956-marostegui.json [production]
05:12 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
05:12 <marostegui@cumin1001> START - Cookbook sre.hosts.decommission [production]
00:12 <mutante> phabricator - upgrading PHP version to 7.2.22 - T230024 [production]
2019-09-30
23:28 <niharika29@deploy1001> Synchronized php-1.34.0-wmf.24/extensions/CentralNotice/resources/infrastructure/: CentralNotice: Replace deprecated editToken with csrfToken - T233538 (duration: 00m 57s) [production]
23:23 <AndyRussG> updated fruec from c591bd653b to 18d89675d0 [production]
21:48 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1290.eqiad.wmnet [production]
21:47 <mutante> mw1290 - scap pull to get it in sync with latest deployment - it was down during scap run for T234153 [production]
21:42 <jforrester@deploy1001> Synchronized robots.txt: Remove old InternetArchive bot rule that's been disabled since 2008 T7582 (duration: 00m 57s) [production]
21:40 <jforrester@deploy1001> Synchronized wmf-config/CommonSettings.php: T222539 Drop no-op hacky disablement of MessageBlobStore::clear() (duration: 05m 13s) [production]
21:38 <James_F> sync failure on mw1290.eqiad.wmnet – Connection timed out [production]
21:26 <mutante> mw1290 - downtimed for onsite work on mgmt, depooled earlier [production]
21:09 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1290.eqiad.wmnet [production]
21:08 <XioNoX> delete BGP to AS131285 on cr1-eqsin [production]
20:43 <arlolra> Updated Parsoid to 1922eb6 (T233459, T230359, T208070) [production]
20:43 <arlolra> T208070 [production]
20:34 <arlolra@deploy1001> Finished deploy [parsoid/deploy@a6da34c]: Updating Parsoid to 1922eb6 (duration: 08m 39s) [production]
20:25 <arlolra@deploy1001> Started deploy [parsoid/deploy@a6da34c]: Updating Parsoid to 1922eb6 [production]
20:06 <mholloway-shell@deploy1001> Finished deploy [mobileapps/deploy@1f9fedd]: Update mobileapps to 131b83f (duration: 05m 55s) [production]
20:00 <mholloway-shell@deploy1001> Started deploy [mobileapps/deploy@1f9fedd]: Update mobileapps to 131b83f [production]