2020-05-22
10:00 | <jbond42> | update pdns-recursor on dns recursors | [production]
09:43 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) | [production]
09:41 | <dzahn@cumin1001> | START - Cookbook sre.hosts.decommission | [production]
09:22 | <jayme@deploy1001> | helmfile [EQIAD] Ran 'sync' command on namespace 'mathoid' for release 'production' . | [production]
09:11 | <elukey> | superset upgrade attempt to 0.36 failed due to a db upgrade error (not seen in staging), rollback to 0.35.2 | [analytics]
09:09 | <elukey@deploy1001> | Finished deploy [analytics/superset/deploy@be203c8]: Rollback superset to 0.35.2 (duration: 00m 43s) | [production]
09:09 | <elukey@deploy1001> | Started deploy [analytics/superset/deploy@be203c8]: Rollback superset to 0.35.2 | [production]
08:41 | <vgutierrez> | reverting hugepages experiment on cp2041 | [production]
08:27 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Fully repool db1149 and db1081', diff saved to https://phabricator.wikimedia.org/P11278 and previous config saved to /var/cache/conftool/dbconfig/20200522-082700-marostegui.json | [production]
08:18 | <elukey@deploy1001> | Finished deploy [analytics/superset/deploy@59ba01d]: Upgrade Superset to 0.36 (duration: 01m 01s) | [production]
08:17 | <elukey@deploy1001> | Started deploy [analytics/superset/deploy@59ba01d]: Upgrade Superset to 0.36 | [production]
08:15 | <elukey> | superset down for maintenance | [analytics]
08:13 | <vgutierrez> | test hugepages allocator on ATS in cp2041 | [production]
08:06 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1149 and db1081', diff saved to https://phabricator.wikimedia.org/P11277 and previous config saved to /var/cache/conftool/dbconfig/20200522-080629-marostegui.json | [production]
07:51 | <RhinosF1> | failed to recover | [tools.mhwikibot]
07:48 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1149 and db1081', diff saved to https://phabricator.wikimedia.org/P11276 and previous config saved to /var/cache/conftool/dbconfig/20200522-074853-marostegui.json | [production]
07:41 | <RhinosF1> | TabError: inconsistent use of tabs and spaces in indentation | [tools.mhwikibot]
07:32 | <RhinosF1> | attempting recovery & check logs | [tools.mhwikibot]
07:20 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1149 and db1081', diff saved to https://phabricator.wikimedia.org/P11275 and previous config saved to /var/cache/conftool/dbconfig/20200522-072000-marostegui.json | [production]
07:15 | <RhinosF1> | outage known, looking soon | [tools.mhwikibot]
07:09 | <elukey> | add druid100[7,8] to the LVS druid-public-brokers service (serving AQS's traffic) | [analytics]
07:07 | <elukey@puppetmaster1001> | conftool action : set/pooled=yes:weight=10; selector: name=druid1008.eqiad.wmnet | [production]
07:04 | <elukey@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=druid1007.eqiad.wmnet | [production]
07:04 | <elukey@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=druid1007.eqiad.wmnet | [production]
04:34 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1081 - T252512', diff saved to https://phabricator.wikimedia.org/P11272 and previous config saved to /var/cache/conftool/dbconfig/20200522-043418-marostegui.json | [production]
00:33 | <wm-bot> | <bd808> Update to 7f1b260 public_html: update link to 'source' | [tools.stashbot]
2020-05-21
23:58 | <ejegg> | updated civicrm from b658fd8233 to 6b1d5902dd | [production]
23:54 | <krinkle@deploy1001> | Synchronized php-1.35.0-wmf.32/includes/content/ContentHandlerFactory.php: If578893f5689 (duration: 01m 06s) | [production]
23:47 | <krinkle@deploy1001> | Synchronized php-1.35.0-wmf.32/extensions/LiquidThreads/classes/Thread.php: If3418cba06e (duration: 01m 07s) | [production]
23:41 | <krinkle@deploy1001> | Synchronized wmf-config/mc.php: I222457729a5b (duration: 01m 08s) | [production]
23:04 | <bstorm_> | added profile::wmcs::kubeadm::k8s::encryption_key and profile::wmcs::kubeadm::k8s::node_token to labs/private T211096 | [paws]
22:40 | <bd808> | Rebuilding all Docker containers for tools-webservice 0.70 (T252700) | [tools]
22:36 | <bd808> | Updated tools-webservice to 0.70 across instances (T252700) | [tools]
22:29 | <bd808> | Building tools-webservice 0.70 via wmcs-package-build.py | [tools]
22:22 | <ZI_Jony> | quit SULWatcher and SULWatcher2 on #cvn-unifications | [cvn]
22:14 | <bd808> | Building tools-webservice 0.70 via wmcs-package-build.py | [toolsbeta]
21:46 | <eileen> | civicrm revision changed from ed4c9522ac to b658fd8233, config revision is 9babae3954 | [production]
21:25 | <wm-bot> | <maurelio> "webservice --backend=kubernetes --canonical python3.7 start" refs. T253346 | [tools.ldap]
21:24 | <wm-bot> | <maurelio> refs. T253346 | [tools.ldap]
21:20 | <wm-bot> | <maurelio> Stopping webservice for T253346 | [tools.ldap]
21:10 | <foks> | removing two files for legal compliance | [production]
20:44 | <bstorm_> | labstore1005 is now running stretch and drbd devices are resyncing after several reboots and some significant effort T224582 | [production]
19:23 | <andrewbogott> | disabling puppet on cloudbackup2001 to prevent the backup job from starting during maintenance | [admin]
19:16 | <andrewbogott> | systemctl disable block_sync-tools-project.service on cloudbackup2001.codfw.wmnet to avoid stepping on current upgrade | [admin]
18:24 | <twentyafterfour> | restarting phabricator on phab1001 to deploy https://phabricator.wikimedia.org/rPHEX2687d08786a9dadcbaa96709de991f471f239830 | [production]
17:24 | <elukey> | add druid100[7,8] to the druid public cluster (not serving load balancer traffic for the moment, only joining the cluster) - T252771 | [analytics]
17:24 | <bblack> | anycast experiment done, all back to normal | [production]
17:20 | <bblack> | anycast experimentation commencing in ulsfo (test route withdrawal)... | [production]
17:04 | <bstorm_> | starting labstore1005 upgrades T224582 | [production]
16:44 | <elukey> | roll restart druid historical nodes on druid100[4-6] (public cluster) to pick up new settings - T252771 | [analytics]