2025-04-15
|
07:28 |
<Emperor> |
make sure all disks are mounted correctly prior to disk-swap testing T391854 ms-be1091 |
[production] |
07:28 |
<Emperor> |
make sure all disks are mounted correctly prior to disk-swap testing T391854 |
[production] |
07:10 |
<elukey@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ms-be1091.eqiad.wmnet with reason: dcops maintenance |
[production] |
07:06 |
<vgutierrez@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload_codfw |
[production] |
07:06 |
<vgutierrez@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text_codfw |
[production] |
07:06 |
<vgutierrez@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload_eqsin |
[production] |
07:05 |
<kartik@deploy1003> |
helmfile [staging] DONE helmfile.d/services/machinetranslation: apply |
[production] |
07:05 |
<vgutierrez@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text_eqsin |
[production] |
07:04 |
<vgutierrez> |
rolling upgrade to varnish 7.1.1-1.1~bpo11+wmf3 in eqsin and codfw - T391334 |
[production] |
06:50 |
<kartik@deploy1003> |
helmfile [staging] START helmfile.d/services/machinetranslation: apply |
[production] |
06:48 |
<kart_> |
Updated cxserver to 2025-04-07-053106-production (T390732, T390711) |
[production] |
06:48 |
<kartik@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/cxserver: apply |
[production] |
06:47 |
<kartik@deploy1003> |
helmfile [eqiad] START helmfile.d/services/cxserver: apply |
[production] |
06:46 |
<kartik@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/cxserver: apply |
[production] |
06:45 |
<kartik@deploy1003> |
helmfile [codfw] START helmfile.d/services/cxserver: apply |
[production] |
06:45 |
<kartik@deploy1003> |
helmfile [staging] DONE helmfile.d/services/cxserver: apply |
[production] |
06:44 |
<kartik@deploy1003> |
helmfile [staging] START helmfile.d/services/cxserver: apply |
[production] |
05:03 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repool pc6 T391454', diff saved to https://phabricator.wikimedia.org/P75003 and previous config saved to /var/cache/conftool/dbconfig/20250415-050307-marostegui.json |
[production] |
04:57 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2016.codfw.wmnet,pc1016.eqiad.wmnet with reason: Maintenance |
[production] |
04:57 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool pc6 T391454', diff saved to https://phabricator.wikimedia.org/P75002 and previous config saved to /var/cache/conftool/dbconfig/20250415-045700-marostegui.json |
[production] |
04:10 |
<mwpresync@deploy1003> |
Pruned MediaWiki: 1.44.0-wmf.22 (duration: 10m 03s) |
[production] |
03:43 |
<mwpresync@deploy1003> |
sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.44.0-wmf.24,1.44.0-wmf.25 --multiversion-image-name docker-registry.discovery.wmnet/restricted/mediawiki-multiversion --multiversion-debug-image-name docker-registry.discov |
[production] |
03:02 |
<mwpresync@deploy1003> |
Started scap sync-world: testwikis to 1.44.0-wmf.25 refs T386220 |
[production] |
02:32 |
<ejegg> |
payments-wiki upgraded from ef9284aa to ba6e8d65 |
[production] |
02:06 |
<jhancock@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1181.eqiad.wmnet with OS bullseye |
[production] |
01:32 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host an-worker1181.eqiad.wmnet with OS bullseye |
[production] |
01:31 |
<jhancock@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['an-worker1181'] |
[production] |
01:30 |
<jhancock@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-worker1181'] |
[production] |
01:24 |
<jhancock@cumin2002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-worker1181.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL |
[production] |
01:03 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.provision for host an-worker1181.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL |
[production] |
2025-04-14
|
23:22 |
<urandom> |
bootstrapping Cassandra/restbase1044-b — T389423 |
[production] |
23:12 |
<zabe> |
zabe@mwmaint1002:~$ cat group2.dblist | xargs -I{} bash -c "echo {}; mwscript extensions/WikimediaMaintenance/migrateESRefToContentTableStage2.php {} --delete /home/zabe/afl_text_table_deletedump/{} --sleep 0.3" # T381599 |
[production] |
22:44 |
<ladsgroup@dns1004> |
END - running authdns-update |
[production] |
22:42 |
<ladsgroup@dns1004> |
START - running authdns-update |
[production] |
22:34 |
<mutante> |
deploy1003 - scap install-world -l release2003.codfw.wmnet T391590 |
[production] |
22:34 |
<dzahn@deploy1003> |
Installation of scap version "4.153.0" completed for 1 hosts |
[production] |
22:33 |
<dzahn@deploy1003> |
Installing scap version "4.153.0" for 1 host(s) |
[production] |
22:30 |
<sbassett> |
Deployed previous good versions of affected files for T391343 |
[production] |
22:25 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: Maintenance |
[production] |
22:25 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2227 (T391056)', diff saved to https://phabricator.wikimedia.org/P75001 and previous config saved to /var/cache/conftool/dbconfig/20250414-222519-fceratto.json |
[production] |
22:20 |
<sbassett> |
Deployment of security patch for T391343 halted |
[production] |
22:10 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P75000 and previous config saved to /var/cache/conftool/dbconfig/20250414-221012-fceratto.json |
[production] |
22:06 |
<ryankemper@cumin2002> |
conftool action : set/pooled=yes:weight=10; selector: name=cirrussearch2060.codfw.wmnet|cirrussearch2067.codfw.wmnet|cirrussearch2068.codfw.wmnet|cirrussearch2072.codfw.wmnet|cirrussearch2085.codfw.wmnet|cirrussearch2104.codfw.wmnet|cirrussearch2105.codfw.wmnet|cirrussearch2107.codfw.wmnet|cirrussearch2109.codfw.wmnet|cirrussearch2114.codfw.wmnet|cirrussearch2115.codfw.wmnet |
[production] |
21:55 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P74999 and previous config saved to /var/cache/conftool/dbconfig/20250414-215504-fceratto.json |
[production] |
21:39 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2227 (T391056)', diff saved to https://phabricator.wikimedia.org/P74998 and previous config saved to /var/cache/conftool/dbconfig/20250414-213957-fceratto.json |
[production] |
21:23 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db2227 (T391056)', diff saved to https://phabricator.wikimedia.org/P74997 and previous config saved to /var/cache/conftool/dbconfig/20250414-212344-fceratto.json |
[production] |
21:23 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2227.codfw.wmnet with reason: Maintenance |
[production] |
21:23 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2205 (T391056)', diff saved to https://phabricator.wikimedia.org/P74996 and previous config saved to /var/cache/conftool/dbconfig/20250414-212320-fceratto.json |
[production] |
21:08 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P74995 and previous config saved to /var/cache/conftool/dbconfig/20250414-210814-fceratto.json |
[production] |
21:06 |
<mforns> |
re-running Commons impact metrics cassandra loading for all top endpoints since the beginning of Commons Impact Metrics time. |
[analytics] |