2021-04-08
§
|
21:38 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@3abe9d0]: Fix for T279667 (duration: 03m 52s) |
[production] |
21:34 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@3abe9d0]: Fix for T279667 |
[production] |
21:33 |
<tgr@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' . |
[production] |
20:33 |
<mutante> |
mw2403 through mw2411 pooled and set to active state in netbox (T279599) |
[production] |
20:32 |
<mutante> |
mw2403 through mw2411 - pooled and set to active state in netbox (T279599) |
[production] |
20:30 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw240[3-9].codfw.wmnet |
[production] |
20:28 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw241[0-1].codfw.wmnet |
[production] |
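The two pooled=yes selectors above cover the new hosts in two batches. As a rough illustration of which names each pattern picks out, the sketch below treats the part after `name=` as a regular expression that must match the whole host name. That matching rule, the helper `hosts_matching`, and the candidate list are assumptions made for illustration, not conftool's actual implementation.

```python
import re

def hosts_matching(selector: str, candidates: list[str]) -> list[str]:
    """Return the candidates whose name fully matches a `name=<regex>` selector."""
    key, _, pattern = selector.partition("=")
    if key != "name":
        raise ValueError(f"only name= selectors are handled here, got {key!r}")
    regex = re.compile(pattern)
    return [host for host in candidates if regex.fullmatch(host)]

# Hypothetical candidate pool: the nine hosts named in the log (mw2403..mw2411).
CANDIDATES = [f"mw{n}.codfw.wmnet" for n in range(2403, 2412)]

first_batch = hosts_matching("name=mw240[3-9].codfw.wmnet", CANDIDATES)
second_batch = hosts_matching("name=mw241[0-1].codfw.wmnet", CANDIDATES)

print(len(first_batch), first_batch[0], first_batch[-1])
print(len(second_batch), second_batch)
```

Under these assumptions the first selector covers mw2403 through mw2409 and the second covers mw2410 and mw2411, so the two actions together account for all nine hosts.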
20:27 |
<legoktm> |
legoktm@deploy1002:~$ cat deb-parsoid-urls.txt | mwscript purgeList.php --wiki=aawiki # to clear releases.wm.o/debian/ cache |
[production] |
20:02 |
<legoktm> |
imported parsoid_0.11.1all_all.deb to releases.wikimedia.org apt repo |
[production] |
19:58 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw241[0-1].codfw.wmnet |
[production] |
19:58 |
<dzahn@cumin1001> |
conftool action : set/weight=10; selector: name=mw241[0-1].codfw.wmnet |
[production] |
19:57 |
<dzahn@cumin1001> |
conftool action : set/weight=10; selector: name=mw238[0-2].codfw.wmnet |
[production] |
19:56 |
<dzahn@cumin1001> |
conftool action : set/weight=10; selector: name=mw2379.codfw.wmnet |
[production] |
19:55 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw240[3-9].codfw.wmnet |
[production] |
19:54 |
<dzahn@cumin1001> |
conftool action : set/weight=30; selector: name=mw240[3-9].codfw.wmnet |
[production] |
19:50 |
<mutante> |
mw2403 through mw2411 - scap pull - new hardware |
[production] |
19:35 |
<dduvall@deploy1002> |
rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.38 |
[production] |
18:52 |
<phuedx> |
phuedx@deploy1002 Synchronized private/PrivateSettings.php: PrivateSettings: Add value for $wgWMEVectorPrefDiffSalt (T261842) |
[production] |
18:51 |
<phuedx@deploy1002> |
Synchronized private/PrivateSettings.php: PrivateSettings: Add value for (T261842) (duration: 01m 06s) |
[production] |
18:37 |
<mutante> |
mw2403 through mw2411 - serial rebooting |
[production] |
18:31 |
<tgr@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' . |
[production] |
18:31 |
<tgr@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' . |
[production] |
18:29 |
<urbanecm@deploy1002> |
Synchronized php-1.36.0-wmf.38/extensions/VisualEditor/modules/ve-mw/ui/tools/ve.ui.MWBackTool.js: e0f3735f6a31d2914bae6c9daac1267707a2d108: Revert incorrect changes to ve.ui.MWBackCommand that made it stop working (T279613) (duration: 01m 07s) |
[production] |
18:25 |
<tgr@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' . |
[production] |
18:25 |
<tgr@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' . |
[production] |
18:23 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[2410-2411].codfw.wmnet with reason: new_install |
[production] |
18:23 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw[2410-2411].codfw.wmnet with reason: new_install |
[production] |
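The downtime entry above names its targets with a bracketed numeric range, `mw[2410-2411].codfw.wmnet`. A minimal sketch of how such an expression can be expanded into individual host names follows. The helper `expand_range` is hypothetical and handles only a single `[start-end]` range; the host-selection syntax used by the real tooling is richer than this.

```python
import re

def expand_range(expr: str) -> list[str]:
    """Expand one `[start-end]` numeric range inside a host expression."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", expr)
    if match is None:
        # No bracketed range: the expression is already a single host name.
        return [expr]
    prefix, start, end, suffix = match.groups()
    width = len(start)  # keep any zero padding used in the range bounds
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(int(start), int(end) + 1)
    ]

hosts = expand_range("mw[2410-2411].codfw.wmnet")
print(hosts)
```

An expression without brackets, such as `wtp1025.eqiad.wmnet`, is returned unchanged as a one-element list.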
18:22 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 7 hosts with reason: new_install |
[production] |
18:22 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 7 hosts with reason: new_install |
[production] |
18:03 |
<mutante> |
mw2403 through mw2411 - new hardware moving into production, not pooled yet, initial puppet run, being added to icinga etc, creating mcrouter certs for them (T279599) |
[production] |
18:02 |
<mutante> |
mw2403 through mw2411 - new hardware moving into production, not pooled yet, initial puppet run, being added to icinga etc, creating mcrouter certs for them (T279599) |
[production] |
17:59 |
<tgr@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' . |
[production] |
17:52 |
<ryankemper@cumin2001> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) |
[production] |
17:29 |
<jgiannelos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:23 |
<jgiannelos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:18 |
<jgiannelos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:16 |
<dancy> |
Scap 3.17.0 deployed to beta cluster |
[production] |
16:51 |
<dancy> |
testing Scap 3.17.0 release on deployment-deploy01 |
[production] |
16:33 |
<elukey> |
reboot an-worker1100 again to check if all the disks come up correctly |
[production] |
16:16 |
<cmjohnson1> |
update BIOS on cp1087, already depooled for h/w issues (T278729) |
[production] |
16:15 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1025.eqiad.wmnet with reason: REIMAGE |
[production] |
16:13 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1025.eqiad.wmnet with reason: REIMAGE |
[production] |
16:10 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:05 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
15:51 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:44 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
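Cookbook runs are logged as a START entry followed later by an END entry, so the elapsed time of a run can be read off by pairing the two. The sketch below does this for the four sre.dns.netbox entries above. The tuple layout and the helper `run_durations` are assumptions for illustration; it also assumes all entries fall on the same day and that runs of the same cookbook do not overlap.

```python
from datetime import datetime

# Entries copied from the log above, newest first: (time, marker, cookbook).
ENTRIES = [
    ("16:10", "END", "sre.dns.netbox"),
    ("16:05", "START", "sre.dns.netbox"),
    ("15:51", "END", "sre.dns.netbox"),
    ("15:44", "START", "sre.dns.netbox"),
]

def run_durations(entries):
    """Pair each START with the next END of the same cookbook; return minutes."""
    open_runs = {}
    durations = []
    for time_str, marker, cookbook in reversed(entries):  # walk oldest first
        stamp = datetime.strptime(time_str, "%H:%M")
        if marker == "START":
            open_runs[cookbook] = stamp
        elif marker == "END" and cookbook in open_runs:
            started = open_runs.pop(cookbook)
            minutes = int((stamp - started).total_seconds() // 60)
            durations.append((cookbook, minutes))
    return durations

durations = run_durations(ENTRIES)
print(durations)
```

Because the log has minute resolution only, the computed durations are accurate to within about a minute.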
15:36 |
<elukey> |
reboot an-worker1100 to see if it helps with the strange BBU behavior |
[production] |
13:55 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon2001-dev.codfw.wmnet |
[production] |
13:44 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts cloudcephmon2001-dev.codfw.wmnet |
[production] |
13:41 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2001.codfw.wmnet with reason: REIMAGE |
[production] |