2019-08-19 §
11:00 <jmm@cumin2001> START - Cookbook sre.ganeti.makevm [production]
10:53 <elukey@cumin1001> START - Cookbook sre.ganeti.makevm [production]
10:53 <elukey@cumin1001> END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) [production]
10:52 <elukey@cumin1001> START - Cookbook sre.ganeti.makevm [production]
10:38 <jdrewniak@deploy1001> Synchronized portals: Wikimedia Portals Update: [[gerrit:530826| Bumping portals to master (T128546)]] (duration: 00m 49s) [production]
10:37 <jdrewniak@deploy1001> Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:530826| Bumping portals to master (T128546)]] (duration: 00m 49s) [production]
10:32 <elukey@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
10:22 <elukey@cumin1001> START - Cookbook sre.ganeti.makevm [production]
09:57 <jbond42> add mapped ipv6 to conf200* servers https://gerrit.wikimedia.org/r/c/operations/puppet/+/528475 [production]
09:26 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
09:24 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
08:57 <godog> add 100G to graphite1004 / graphite2003 /srv LVs [production]
07:59 <onimisionipe> shutdown elastic2050 to prepare for mgmt reset - T230597 [production]
07:40 <marostegui> Redact napwikisource on db1124 and db2094 - T210762 [production]
07:19 <moritzm> installing golang-1.11 security updates on buster [production]
07:08 <moritzm> installing ffmpeg security updates on buster [production]
06:37 <vgutierrez> upgrading acme-chief to version 0.20 on production servers - T229096 [production]
06:30 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir1001.eqiad.wmnet [production]
06:29 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir1001.eqiad.wmnet [production]
06:28 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir1002.eqiad.wmnet [production]
06:27 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir1002.eqiad.wmnet [production]
06:26 <moritzm> installing ghostscript security updates on scb/proton/notebook* hosts [production]
06:25 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir2001.codfw.wmnet [production]
06:25 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir2001.codfw.wmnet [production]
06:24 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir2002.codfw.wmnet [production]
06:22 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir2002.codfw.wmnet [production]
06:21 <vgutierrez> rolling upgrade of nginx in ncredir hosts [production]
06:03 <moritzm> installing php5 security updates [production]
05:51 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Remove db2067 from config T230705 (duration: 00m 47s) [production]
05:50 <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Remove db2067 from config T230705 (duration: 00m 50s) [production]
05:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2067, will be moved to m1 T230705', diff saved to https://phabricator.wikimedia.org/P8930 and previous config saved to /var/cache/conftool/dbconfig/20190819-054606-marostegui.json [production]
05:29 <elukey> reboot cp2004 due to bnx2x crash (kern.log saved into my home on the host if needed) [production]
2019-08-18 §
10:39 <arturo> rebooting cloudvirt1023 for new interface names configuration [admin]
10:34 <arturo> downtimed cloudvirt1023 for 2 days [admin]
08:57 <arturo> restart shinken service. Was in failure state, and is working after that [shinken]
08:41 <arturo> add myself as projectadmin to investigate shinken issues [shinken]
08:28 <onimisionipe> running `_cluster/reroute?pretty&explain=true&retry_failed` on eqiad production-search cluster to force allocation of shards [production]
08:11 <arturo> restart maintain-kuberusers service in tools-k8s-master-01 [tools]
2019-08-17 §
19:17 <wm-bot> <lucaswerkmeister> deployed abde3331a2 (improved visibility) [tools.pagepile-visual-filter]
11:49 <wm-bot> <lucaswerkmeister> deployed dc25780163 (crash on bad POST) [tools.pagepile-visual-filter]
10:57 <wm-bot> <lucaswerkmeister> deployed 58cb5a6b77 (rm OAuth, trim requirements, more explanation) and rebuilt venv [tools.pagepile-visual-filter]
10:56 <arturo> force-reboot tools-worker-1006. Is completely stuck [tools]
2019-08-16 §
19:48 <sbassett> Deployed security patch for T230576 (ex:MobileFrontend) [production]
19:31 <wm-bot> <maurelio> Restarting SULWatchers. [tools.stewardbots]
19:17 <Cam11598> restarted CVNBot3 [cvn]
18:57 <@> helmfile [STAGING] Ran 'apply' command on namespace 'sessionstore' for release 'staging' . [production]
16:38 <XioNoX> add BGP sessions to Scaleway (AS12876) in esams [production]
16:12 <elukey> upload prometheus-druid-exporter 0.7-1 to stretch/buster-wikimedia [production]
15:42 <elukey> roll restart of druid broker/historicals to pick up new logging/metrics settings [production]
14:41 <wm-bot> <maurelio> Re-enable archiving for MABot after bugfix. [tools.mabot]