2023-09-28
16:41 <fabfur@cumin1001> END (ERROR) - Cookbook sre.cdn.roll-restart-varnish (exit_code=97) rolling restart of Varnish on 7 hosts matching query A:cp-upload_codfw and not P{cp2028*} [production]
16:41 <fabfur@cumin1001> START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 7 hosts matching query A:cp-upload_codfw and not P{cp2028*} [production]
16:41 <jgiannelos@deploy2002> helmfile [staging] DONE helmfile.d/services/wikifeeds: apply [production]
16:41 <jgiannelos@deploy2002> helmfile [staging] START helmfile.d/services/wikifeeds: apply [production]
16:39 <jgiannelos@deploy2002> helmfile [staging] DONE helmfile.d/services/wikifeeds: apply [production]
16:39 <jgiannelos@deploy2002> helmfile [staging] START helmfile.d/services/wikifeeds: apply [production]
16:37 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host eventlog1003.eqiad.wmnet [production]
16:35 <fabfur@cumin1001> END (ERROR) - Cookbook sre.cdn.roll-restart-varnish (exit_code=97) rolling restart of Varnish on 8 hosts matching query A:cp-upload_codfw [production]
16:35 <fabfur@cumin1001> END (ERROR) - Cookbook sre.cdn.roll-restart-varnish (exit_code=97) rolling restart of Varnish on 8 hosts matching query A:cp-text_codfw [production]
16:26 <joal@deploy2002> helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply [production]
16:26 <joal@deploy2002> helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply [production]
16:23 <fabfur@cumin1001> START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 8 hosts matching query A:cp-upload_codfw [production]
16:23 <fabfur@cumin1001> START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 8 hosts matching query A:cp-text_codfw [production]
16:14 <hnowlan> enabling puppet on A:cp, routing mediarequests API via rest-gateway [production]
16:03 <hnowlan> disabled puppet on A:cp [production]
15:54 <hnowlan@deploy1002> helmfile [codfw] DONE helmfile.d/services/media-analytics: apply [production]
15:53 <hnowlan@deploy1002> helmfile [codfw] START helmfile.d/services/media-analytics: apply [production]
15:49 <hnowlan@deploy1002> helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply [production]
15:49 <hnowlan@deploy1002> helmfile [eqiad] START helmfile.d/services/media-analytics: apply [production]
15:49 <hnowlan@deploy1002> helmfile [staging] DONE helmfile.d/services/media-analytics: apply [production]
15:48 <hnowlan@deploy1002> helmfile [staging] START helmfile.d/services/media-analytics: apply [production]
15:48 <brennen@deploy2002> Sync cancelled. [production]
15:48 <fabfur@cumin1001> END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 8 hosts matching query A:cp-text_ulsfo [production]
15:47 <brennen@deploy2002> brennen: Backport for [[gerrit:961719|Revert "NostalgiaTemplate.php: Fix array illegal offset error"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
15:47 <hnowlan@deploy1002> helmfile [staging] DONE helmfile.d/services/rest-gateway: apply [production]
15:47 <hnowlan@deploy1002> helmfile [staging] START helmfile.d/services/rest-gateway: apply [production]
15:46 <brennen@deploy2002> Started scap: Backport for [[gerrit:961719|Revert "NostalgiaTemplate.php: Fix array illegal offset error"]] [production]
15:39 <brennen@deploy2002> Sync cancelled. [production]
15:38 <brennen@deploy2002> krinkle and brennen: Backport for [[gerrit:961717|NostalgiaTemplate.php: Fix array illegal offset error]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
15:36 <brennen@deploy2002> Started scap: Backport for [[gerrit:961717|NostalgiaTemplate.php: Fix array illegal offset error]] [production]
15:27 <fabfur@cumin1001> END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 7 hosts matching query A:cp-upload_ulsfo and not P{cp4052*} [production]
15:06 <tchin@deploy2002> helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply [production]
15:05 <tchin@deploy2002> helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply [production]
15:03 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1229.eqiad.wmnet with OS bullseye [production]
14:49 <inflatador> bking@wdqs1016 shutting down services to compress a 1.2 TB jnl file [production]
14:43 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1119', diff saved to https://phabricator.wikimedia.org/P52725 and previous config saved to /var/cache/conftool/dbconfig/20230928-144338-root.json [production]
14:35 <moritzm> installing ghostscript security updates [production]
14:32 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1016.eqiad.wmnet with reason: jnl compression [production]
14:32 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1016.eqiad.wmnet with reason: jnl compression [production]
14:13 <klausman> restarting pybal on lvs1019 and lvs2013 (LVS low-traffic actives) for T347278 (ORES turndown) [production]
14:11 <arnaudb@cumin1001> dbctl commit (dc=all): 'Depooling db2169:3316 (T343198)', diff saved to https://phabricator.wikimedia.org/P52723 and previous config saved to /var/cache/conftool/dbconfig/20230928-141140-arnaudb.json [production]
14:11 <arnaudb@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance [production]
14:11 <arnaudb@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance [production]
14:11 <arnaudb@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2158 (T343198)', diff saved to https://phabricator.wikimedia.org/P52722 and previous config saved to /var/cache/conftool/dbconfig/20230928-141118-arnaudb.json [production]
14:08 <cdanis> repooling cp5030 after haproxy upgrade & config deploy T317799 [production]
14:02 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1228.eqiad.wmnet with OS bullseye [production]
14:02 <jhancock@cumin2002> START - Cookbook sre.hosts.reimage for host db1228.eqiad.wmnet with OS bullseye [production]
14:02 <cdanis> depooling cp5030 for haproxy upgrade & testing T317799 [production]
14:01 <moritzm> installing gsl security updates [production]
14:00 <klausman> restarted pybal on lvs1020 and lvs2014 (LVS low-traffic backups) for T347278 (ORES turndown) [production]