2021-03-18
§
|
16:37 |
<dzahn@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw2239.codfw.wmnet |
[production] |
16:24 |
<arturo> |
live-hacking puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/672456 |
[toolsbeta] |
16:21 |
<andrewbogott> |
enabling puppet tools-wide |
[tools] |
16:20 |
<andrewbogott> |
disabling puppet tools-wide to test https://gerrit.wikimedia.org/r/c/operations/puppet/+/672456 |
[tools] |
16:19 |
<bstorm> |
added profile::toolforge::infrastructure class to puppetmaster T277756 |
[tools] |
16:10 |
<addshore> |
reload zuul for https://gerrit.wikimedia.org/r/673208 and https://gerrit.wikimedia.org/r/673211 T277750 (apitests php versions) |
[releng] |
15:55 |
<addshore> |
reload zuul for Introduce query-builder job so it can use npm 6.14.* instead [integration/config] - https://gerrit.wikimedia.org/r/673183 T277060 |
[releng] |
15:52 |
<hashar> |
Disconnecting a bunch of Jenkins agents to upgrade them to Java 11 # T269354 |
[releng] |
15:33 |
<shdubsh> |
clean up dead letter queue and restart all logstashes |
[production] |
14:50 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
14:43 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
14:37 |
<dcausse> |
repooling wdqs1005 |
[production] |
14:29 |
<hashar> |
Restarting CI Jenkins for plugin upgrade |
[production] |
13:49 |
<elukey> |
reboot analytics1066 |
[production] |
13:23 |
<ladsgroup@deploy1002> |
Synchronized php-1.36.0-wmf.35/extensions/Wikibase/repo: [[gerrit:673108|languageLabelDescriptionAliases: use getLanguageNameByCode]] (T275611 T277722) (duration: 01m 14s) |
[production] |
13:20 |
<Majavah> |
manually systemctl daemon-reload && systemctl start srv-swift\x2dstorage-lv\x2da1.mount on deployment-ms-be* nodes for T276179 |
[releng] |
12:58 |
<jbond42> |
upload cas_6.3.2 to apt buster-wikimedia |
[production] |
12:53 |
<arturo> |
create anti-affinity server group toolsbeta-sgegrid-master-shadow |
[toolsbeta] |
12:51 |
<arturo> |
rebuild toolsbeta-sgegrid-shadow instance as debian buster (T277653) |
[toolsbeta] |
12:50 |
<arturo> |
added puppet prefix `toolsbeta-sgegrid-shadow`, migrate puppet config from VM to here |
[toolsbeta] |
12:48 |
<arturo> |
destroy VM toolsbeta-buster-gridmaster (no longer useful) T277653 |
[toolsbeta] |
12:47 |
<arturo> |
delete puppet prefix `toolsbeta-buster-grirdmaster` (no longer useful) T277653 |
[toolsbeta] |
11:37 |
<mvolz@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' . |
[production] |
11:34 |
<mvolz@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' . |
[production] |
11:25 |
<mvolz@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' . |
[production] |
11:24 |
<urbanecm@deploy1002> |
Synchronized wmf-config/flaggedrevs.php: 896c9f019b17d1ad3a1589d377158ca2fb91ebaa: flaggedrevs: Disable multiple dimensions in hewikisource (duration: 01m 09s) |
[production] |
11:20 |
<urbanecm@deploy1002> |
Synchronized php-1.36.0-wmf.35/extensions/GrowthExperiments/includes/HomepageHooks.php: 3b2aa1aa28e9d204f32ae937a84ec211137cbb2e: Remove variant C from list of valid variants (T277727) (duration: 01m 09s) |
[production] |
11:16 |
<mvolz@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' . |
[production] |
11:14 |
<mvolz@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' . |
[production] |
11:11 |
<mvolz@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' . |
[production] |
11:11 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 0005676e704cad907655a4a0bca7bd2164714b1c: GrowthExperiments: set $wgGEHomepageNewAccountVariants to D only (T277727) (duration: 01m 10s) |
[production] |
11:08 |
<urbanecm@deploy1002> |
Synchronized wmf-config/CommonSettings.php: NOOP: e7f5eac: Enable CentralAuth IRC feed in beta cluster (T277432) (duration: 01m 12s) |
[production] |
09:13 |
<_joe_> |
hard reboot of snapshot1005 |
[production] |
09:10 |
<addshore> |
reload zuul for Remove mwselenium-quibble-docker [integration/config] - https://gerrit.wikimedia.org/r/673206 |
[releng] |
09:04 |
<_joe_> |
attempted reboot of snapshot1005, read-only filesystem and probably disks are broken beyond repair |
[production] |
08:44 |
<Majavah> |
delete now unused deployment-ircd T277081 |
[releng] |
08:40 |
<Majavah> |
delete deployment-db06, 07/08 have been working fine for a week now |
[releng] |
08:27 |
<godog> |
swift eqiad-prod: less weight for ms-be[1019-1026] - T272836 |
[production] |
08:18 |
<akosiaris@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: REIMAGE |
[production] |
08:16 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: REIMAGE |
[production] |
08:03 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14946 and previous config saved to /var/cache/conftool/dbconfig/20210318-080258-root.json |
[production] |
08:02 |
<akosiaris> |
reimage ml-serve1004 to debug a docker volume_group issue |
[production] |
07:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14945 and previous config saved to /var/cache/conftool/dbconfig/20210318-074754-root.json |
[production] |
07:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14944 and previous config saved to /var/cache/conftool/dbconfig/20210318-073250-root.json |
[production] |
07:20 |
<dcausse> |
depooling & restarting blazegraph on wdqs1005 |
[production] |
07:19 |
<marostegui> |
Deploy schema change on s4 codfw master, lag will appear - T276150 T276156 |
[production] |
07:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P14943 and previous config saved to /var/cache/conftool/dbconfig/20210318-071747-root.json |
[production] |
07:15 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE |
[production] |
07:13 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE |
[production] |
06:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add db1161 to dbctl, depooled T258361', diff saved to https://phabricator.wikimedia.org/P14942 and previous config saved to /var/cache/conftool/dbconfig/20210318-063241-marostegui.json |
[production] |