2019-08-19
10:32 <elukey@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
10:22 <elukey@cumin1001> START - Cookbook sre.ganeti.makevm [production]
09:57 <jbond42> add mapped ipv6 to conf200* servers https://gerrit.wikimedia.org/r/c/operations/puppet/+/528475 [production]
09:26 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
09:24 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
08:57 <godog> add 100G to graphite1004 / graphite2003 /srv LVs [production]
07:59 <onimisionipe> shutdown elastic2050 to prepare for mgmt reset - T230597 [production]
07:40 <marostegui> Redact napwikisource on db1124 and db2094 - T210762 [production]
07:19 <moritzm> installing golang-1.11 security updates on buster [production]
07:08 <moritzm> installing ffmpeg security updates on buster [production]
06:37 <vgutierrez> upgrading acme-chief to version 0.20 on production servers - T229096 [production]
06:30 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir1001.eqiad.wmnet [production]
06:29 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir1001.eqiad.wmnet [production]
06:28 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir1002.eqiad.wmnet [production]
06:27 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir1002.eqiad.wmnet [production]
06:26 <moritzm> installing ghostscript security updates on scb/proton/notebook* hosts [production]
06:25 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir2001.codfw.wmnet [production]
06:25 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir2001.codfw.wmnet [production]
06:24 <vgutierrez@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ncredir2002.codfw.wmnet [production]
06:22 <vgutierrez@puppetmaster1001> conftool action : set/pooled=no; selector: name=ncredir2002.codfw.wmnet [production]
06:21 <vgutierrez> rolling upgrade of nginx in ncredir hosts [production]
06:03 <moritzm> installing php5 security updates [production]
05:51 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Remove db2067 from config T230705 (duration: 00m 47s) [production]
05:50 <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Remove db2067 from config T230705 (duration: 00m 50s) [production]
05:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2067, will be moved to m1 T230705', diff saved to https://phabricator.wikimedia.org/P8930 and previous config saved to /var/cache/conftool/dbconfig/20190819-054606-marostegui.json [production]
05:29 <elukey> reboot cp2004 due to bnx2x crash (kern.log saved into my home on the host if needed) [production]
2019-08-18
08:28 <onimisionipe> running `_cluster/reroute?pretty&explain=true&retry_failed` on eqiad production-search cluster to force allocation of shards [production]
2019-08-16
19:48 <sbassett> Deployed security patch for T230576 (ex:MobileFrontend) [production]
18:57 <@> helmfile [STAGING] Ran 'apply' command on namespace 'sessionstore' for release 'staging' . [production]
16:38 <XioNoX> add BGP sessions to Scaleway (AS12876) in esams [production]
16:12 <elukey> upload prometheus-druid-exporter 0.7-1 to stretch/buster-wikimedia [production]
15:42 <elukey> roll restart of druid broker/historicals to pick up new logging/metrics settings [production]
14:39 <onimisionipe> run `bmc-device --cold-reset; echo $?` in elastic2050 hoping it resets mgmt interface -T230597 [production]
14:24 <gehel> rolling reboot of cloudelastic [production]
13:52 <mholloway-shell@deploy1001> Synchronized wmf-config/InitialiseSettings-labs.php: MachineVision (beta): Request labels targeting Beta Wikidata (duration: 00m 50s) [production]
08:18 <_joe_> stopping php on phab1003, to restart it with systemd [production]
06:50 <_joe_> upgrading envoyproxy across production (http2 CVEs) [production]
02:51 <vgutierrez> repooling cp5002, running compress.so experiment [production]
2019-08-15
23:35 <smalyshev@deploy1001> Finished deploy [wdqs/wdqs@b4da6e4]: Rollback blazegraph due to T230588 (duration: 09m 48s) [production]
23:25 <smalyshev@deploy1001> Started deploy [wdqs/wdqs@b4da6e4]: Rollback blazegraph due to T230588 [production]
21:54 <smalyshev@deploy1001> Finished deploy [wdqs/wdqs@fce8177]: Weekly deploy (duration: 25m 28s) [production]
21:28 <smalyshev@deploy1001> Started deploy [wdqs/wdqs@fce8177]: Weekly deploy [production]
21:27 <ebernhardson> finish restarting cloudelastic-chi-eqiad with -XX:NewRatio=3 [production]
21:18 <ebernhardson> increase cloudelastic indices.recovery.max_bytes_per_sec from 40mbit to 512mbit as these have 10G networking [production]
21:07 <ebernhardson> restart cloudelastic1002 with -XX:NewRatio=3 to match cloudelastic1001 [production]
20:22 <gehel@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
19:37 <ema> depool cp5002 during the EU night, running compress.so experiment [production]
19:28 <gehel@cumin1001> END (PASS) - Cookbook sre.wdqs.reboot-wdqs (exit_code=0) [production]
19:19 <sbassett> Deployed security patch for T230402 (1.34.0-wmf.17) [production]
19:18 <gehel@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]