2022-03-17
06:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 25%: After buffer pool testing', diff saved to https://phabricator.wikimedia.org/P22739 and previous config saved to /var/cache/conftool/dbconfig/20220317-062648-root.json |
[production] |
06:15 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance |
[production] |
06:15 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance |
[production] |
06:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 10%: After buffer pool testing', diff saved to https://phabricator.wikimedia.org/P22738 and previous config saved to /var/cache/conftool/dbconfig/20220317-061144-root.json |
[production] |
04:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1146:3314 (T300775)', diff saved to https://phabricator.wikimedia.org/P22737 and previous config saved to /var/cache/conftool/dbconfig/20220317-040634-marostegui.json |
[production] |
04:06 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance |
[production] |
04:06 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance |
[production] |
02:57 |
<andrew@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1016.eqiad.wmnet with OS bullseye |
[production] |
02:07 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye |
[production] |
02:07 |
<andrew@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1016.eqiad.wmnet with OS bullseye |
[production] |
01:11 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye |
[production] |
2022-03-16
23:52 |
<tzatziki> |
Removing two files for legal compliance |
[production] |
21:17 |
<cjming> |
end running skin update preference maintenance script |
[production] |
20:52 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dumpsdata1006.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
20:40 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: [no-op] 8efa537: GrowthExperiments: Set GEWelcomeSurveyShowMailingListQuestion (T303240) (duration: 00m 53s) |
[production] |
20:38 |
<robh@cumin1001> |
START - Cookbook sre.hosts.provision for host dumpsdata1006.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
20:35 |
<urbanecm@deploy1002> |
Synchronized php-1.38.0-wmf.26/extensions/WikimediaMaintenance/: 9ba157b: Add insert option for update skin preferences script (T299104) (duration: 00m 50s) |
[production] |
20:34 |
<urbanecm@deploy1002> |
Synchronized php-1.38.0-wmf.25/extensions/WikimediaMaintenance/: ebfc516: Add script to update vector skin preferences (T299104) (duration: 00m 51s) |
[production] |
20:32 |
<robh@cumin1001> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dumpsdata1006.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
20:24 |
<pt1979@cumin1001> |
START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye |
[production] |
20:13 |
<robh@cumin1001> |
START - Cookbook sre.hosts.provision for host dumpsdata1006.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
20:13 |
<urbanecm@deploy1002> |
Synchronized docroot/noc/db.php: f649199: Migrate wmfDatacenter(s) to wmgDatacenter(s) (T45956; 3/3) (duration: 00m 49s) |
[production] |
20:12 |
<urbanecm@deploy1002> |
Synchronized multiversion/: f649199: Migrate wmfDatacenter(s) to wmgDatacenter(s) (T45956; 2/3) (duration: 00m 50s) |
[production] |
20:11 |
<urbanecm@deploy1002> |
Synchronized wmf-config/: f649199: Migrate wmfDatacenter(s) to wmgDatacenter(s) (T45956; 1/3) (duration: 00m 50s) |
[production] |
19:22 |
<otto@deploy1002> |
Finished deploy [analytics/refinery@2d2056a] (hadoop-test): (no justification provided) (duration: 07m 50s) |
[production] |
19:14 |
<otto@deploy1002> |
Started deploy [analytics/refinery@2d2056a] (hadoop-test): (no justification provided) |
[production] |
18:32 |
<sukhe> |
running: homer "cr*-drmrs*" commit "Gerrit 771359: Set up BGP peering in drmrs for Wikidough." |
[production] |
18:09 |
<aqu@deploy1002> |
Finished deploy [airflow-dags/analytics_test@257960f]: Migrate session_length/daily from Oozie to Airflow [airflow-dags/analytics_test@257960f] (duration: 00m 08s) |
[production] |
18:09 |
<aqu@deploy1002> |
Started deploy [airflow-dags/analytics_test@257960f]: Migrate session_length/daily from Oozie to Airflow [airflow-dags/analytics_test@257960f] |
[production] |
18:02 |
<aqu@deploy1002> |
Finished deploy [airflow-dags/analytics@257960f]: Migrate session_length/daily from Oozie to Airflow [airflow-dags/analytics@257960f] (duration: 00m 08s) |
[production] |
18:02 |
<aqu@deploy1002> |
Started deploy [airflow-dags/analytics@257960f]: Migrate session_length/daily from Oozie to Airflow [airflow-dags/analytics@257960f] |
[production] |
18:00 |
<razzi@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on karapace1001.eqiad.wmnet with reason: Setting up karapace for the first time |
[production] |
18:00 |
<razzi@cumin1001> |
START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on karapace1001.eqiad.wmnet with reason: Setting up karapace for the first time |
[production] |
17:36 |
<dancy@deploy1002> |
Synchronized multiversion/MWMultiVersion.php: Config: [[gerrit:771001|mwscript: Support --force-version flag (T303878)]] (duration: 00m 57s) |
[production] |
17:21 |
<sukhe@puppetmaster1001> |
conftool action : set/pooled=yes; selector: dc=drmrs,cluster=cache_text,service=ats-tls |
[production] |
17:21 |
<sukhe@puppetmaster1001> |
conftool action : set/pooled=yes; selector: dc=drmrs,cluster=cache_text,service=varnish-fe |
[production] |
17:21 |
<sukhe@puppetmaster1001> |
conftool action : set/pooled=yes; selector: dc=drmrs,cluster=cache_text,service=ats-be |
[production] |
17:13 |
<aqu@deploy1002> |
Finished deploy [analytics/refinery@d039471] (hadoop-test): Migrate session_length/daily from Oozie to Airflow [analytics/refinery@d039471] (duration: 07m 23s) |
[production] |
17:11 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6016.drmrs.wmnet with OS buster |
[production] |
17:06 |
<aqu@deploy1002> |
Started deploy [analytics/refinery@d039471] (hadoop-test): Migrate session_length/daily from Oozie to Airflow [analytics/refinery@d039471] |
[production] |
17:06 |
<aqu@deploy1002> |
Finished deploy [analytics/refinery@d039471] (thin): Migrate session_length/daily from Oozie to Airflow [analytics/refinery@d039471] (duration: 00m 07s) |
[production] |
17:06 |
<aqu@deploy1002> |
Started deploy [analytics/refinery@d039471] (thin): Migrate session_length/daily from Oozie to Airflow [analytics/refinery@d039471] |
[production] |
17:04 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance |
[production] |
17:04 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance |
[production] |
16:48 |
<aqu@deploy1002> |
Finished deploy [analytics/refinery@d039471]: Migrate session_length/daily from Oozie to Airflow [analytics/refinery@d039471] (duration: 25m 49s) |
[production] |
16:45 |
<pt1979@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1024.eqiad.wmnet with OS bullseye |
[production] |
16:36 |
<Emperor> |
rolling restart of ms-fe10[09-12] so they know about removal of older proxies T303733 |
[production] |
16:30 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage |
[production] |
16:28 |
<Emperor> |
moving swiftrepl and stats reporter host from ms-fe1005 to ms-fe1009 T303733 |
[production] |
16:27 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage |
[production] |