2024-08-08 §
15:21 <klausman@cumin2002> START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ml-serve2004.codfw.wmnet with reason: Hardware maintenance for memory errors [production]
12:47 <jnuche@deploy1003> rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.17 refs T366962 [production]
10:39 <jnuche@deploy1003> rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.17 refs T366962 [production]
09:53 <jnuche@deploy1003> rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.17 refs T366962 [production]
2024-08-07 §
08:18 <jnuche@deploy1003> rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.17 refs T366962 [production]
2024-08-06 §
20:26 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading scholarly_articles on wdqs1023.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20240729/ using stat1009.eqiad.wmnet) [production]
19:49 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading scholarly_articles on wdqs1023.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20240729/ using stat1009.eqiad.wmnet) [production]
08:16 <jnuche@deploy1003> rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.17 refs T366962 [production]
05:41 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading scholarly_articles on wdqs1023.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20240729/ using stat1009.eqiad.wmnet) [production]
04:39 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading scholarly_articles on wdqs1023.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20240729/ using stat1009.eqiad.wmnet) [production]
04:38 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading wikidata_main on wdqs1021.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20240729/ using stat1009.eqiad.wmnet) [production]
04:01 <mwpresync@deploy1003> Pruned MediaWiki: 1.43.0-wmf.14 (duration: 00m 58s) [production]
03:47 <mwpresync@deploy1003> Finished scap: testwikis to 1.43.0-wmf.17 refs T366962 (duration: 45m 05s) [production]
03:02 <mwpresync@deploy1003> Started scap sync-world: testwikis to 1.43.0-wmf.17 refs T366962 [production]
2024-08-05 §
19:29 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_main on wdqs1021.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20240729/ using stat1009.eqiad.wmnet) [production]
18:52 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading wikidata_main on wdqs1021.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20240729/ using stat1009.eqiad.wmnet) [production]
2024-08-01 §
18:10 <brennen@deploy1003> rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.16 refs T366961 [production]
18:00 <brennen> 1.43.0-wmf.16 train (T366961): no current blockers, logs cluttered but not too scary, rolling to all wikis. [production]
07:47 <ayounsi@cumin1002> END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netbox 4 sync - ayounsi@cumin1002" [production]
07:39 <ayounsi@cumin1002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netbox 4 sync - ayounsi@cumin1002" [production]
2024-07-31 §
20:12 <cjming@deploy1003> Finished scap: Backport for [[gerrit:1056495|[wmf-config] Remove trailing slash in SSO domain]] (duration: 08m 04s) [production]
20:06 <cjming@deploy1003> cjming, d3r1ck01: Backport for [[gerrit:1056495|[wmf-config] Remove trailing slash in SSO domain]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
20:04 <cjming@deploy1003> Started scap sync-world: Backport for [[gerrit:1056495|[wmf-config] Remove trailing slash in SSO domain]] [production]
18:17 <brennen@deploy1003> rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.16 refs T366961 [production]
18:09 <brennen> 1.43.0-wmf.16 train (T366961): no current blockers, logs clean, rolling to group1. [production]
2024-07-30 §
18:53 <brennen@deploy1003> rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.16 refs T366961 [production]
18:33 <brennen> 1.43.0-wmf.16 train (T366961): blockers resolved, rolling to group0 [production]
04:07 <mwpresync@deploy1003> Pruned MediaWiki: 1.43.0-wmf.13 (duration: 06m 51s) [production]
03:02 <mwpresync@deploy1003> Started scap sync-world: testwikis to 1.43.0-wmf.16 refs T366961 [production]
2024-07-29 §
13:45 <logmsgbot> lucaswerkmeister-wmde@deploy1003 Synchronized php-1.43.0-wmf.15/extensions/ContentTranslation/extension.json: Backport for [[gerrit:1057853|AX: Unregister "axArticleFooterEntrypointRegistrar" hook handler (T363338)]] (duration: 06m 36s) [production]
13:24 <logmsgbot> lucaswerkmeister-wmde@deploy1003 Synchronized wmf-config/: Backport for [[gerrit:1055434|Enable mul language code on Wikidata (limited mode) (T330281)]] (duration: 06m 47s) [production]
2024-07-25 §
18:12 <dduvall@deploy1002> rebuilt and synchronized wikiversions files: group2 to 1.43.0-wmf.15 refs T366960 [production]
17:56 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on netbox2003.codfw.wmnet,netbox1003.eqiad.wmnet with reason: netbox upgrade prep work [production]
17:56 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on netbox2003.codfw.wmnet,netbox1003.eqiad.wmnet with reason: netbox upgrade prep work [production]
17:06 <swfrench-wmf> running homer 'cr*eqiad*' commit 'T351074' for k8s worker reimage [production]
2024-07-24 §
18:10 <dduvall@deploy1002> rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.15 refs T366960 [production]
2024-07-23 §
18:14 <dduvall@deploy1002> rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.15 refs T366960 [production]
18:13 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsadm --delete-service --tcp-service 10.2.2.1:443' (appservers-https eqiad) - T367949 [production]
18:11 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsadm --delete-service --tcp-service 10.2.2.22:443' (api-https eqiad) - T367949 [production]
18:11 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsa [production]
18:10 <swfrench-wmf> sudo cumin 'A:lvs-secondary-codfw or A:lvs-low-traffic-codfw' 'ipvsa [production]
18:08 <swfrench-wmf> sudo cumin 'A:lvs-secondary-codfw or A:lvs-low-traffic-codfw' 'ipvsa [production]
17:58 <swfrench-wmf> sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service' - T367949 [production]
17:51 <swfrench-wmf> sudo cumin 'A:lvs-secondary-eqiad' 'systemctl restart pybal.service' - T367949 [production]
17:46 <logmsgbot> nshahquinn-wmf@deploy1002 Finished deploy [airflow-dags/analytics_product@ebd9e13]: (no justification provided) (duration: 00m 07s) [production]
17:46 <logmsgbot> nshahquinn-wmf@deploy1002 Started deploy [airflow-dags/analytics_product@ebd9e13]: (no justification provided) [production]
17:44 <swfrench-wmf> sudo cumin 'A:lvs-low-traffic-codfw' 'systemctl restart pybal.service' - T367949 [production]
17:28 <swfrench-wmf> run-puppet-agent on O:lvs::balancer to pick up switch to service_setup, removal of profile::lvs::realserver::pools - T367949 [production]
17:17 <swfrench-wmf> run-puppet-agent on A:dnsbox to pick up switch to lvs_setup - T367949 [production]
17:06 <swfrench-wmf> ran authdns-update on dns1004 to pick up removal of appservers / api records - T367949 [production]