2020-09-04
16:19 <James_F> Zuul: Voting FR jobs for ParserFunctions and cldr. [releng]
15:32 <hashar> Updated doc.wikimedia.org docroot for https://gerrit.wikimedia.org/r/c/integration/docroot/+/624714 [releng]
14:10 <Reedy> rebooted due to laggy irc echoing [tools.wikibugs]
10:31 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) [production]
10:29 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1087 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12492 and previous config saved to /var/cache/conftool/dbconfig/20200904-102955-marostegui.json [production]
10:28 <marostegui> Deploy MCR schema change on db1087 (sanitarium master), this will generate lag (probably a few days) on s8 labsdb hosts T238966 [production]
09:48 <marostegui> Restart prometheus-mysqld-exporter on db2125 [production]
09:39 <rxy> Flags +AV were set on Mirinano in #cvn-ja. [cvn]
09:39 <rxy> Flags +AV were set on Mirinano in #cvn-sw. [cvn]
09:11 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]
08:58 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) [production]
08:31 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]
08:29 <elukey> roll restart of the hadoop workers (test and analytics cluster) for openjdk upgrades [production]
08:08 <moritzm> installing 4.19.132 kernel on buster systems (only installing the deb, reboots separately) [production]
07:30 <moritzm> installing 4.9.228 kernel on stretch systems (only installing the deb, reboots separately) [production]
07:07 <wm-bot> <jeanfred> Deploy latest from Git master: 3099ce3 [tools.wikiloves]
06:54 <joal> Manually restart mediawiki-history-drop-snapshot after hive-partitions/hdfs-folders mismatch fix [analytics]
06:08 <elukey> reset-failed mediawiki-history-drop-snapshot on an-launcher1002 to clear icinga errors [analytics]
05:13 <marostegui> Deploy MCR schema change on s4 eqiad master T238966 [production]
01:52 <milimetric> aborted aqs deploy due to cassandra error [analytics]
01:51 <milimetric@deploy1001> Finished deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints (duration: 63m 18s) [production]
01:35 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
01:30 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
01:23 <ryankemper> (Following the restart of blazegraph, service has been restored to `wdqs2003`. See https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&from=1599182219699&to=1599182547699) [production]
01:16 <ryankemper> Glancing at https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&from=1599170628749&to=1599182011243, looks like `wdqs2003`'s blazegraph isn't happy based off the null data entries. Restarting blazegraph: `ryankemper@wdqs2003:~$ sudo systemctl restart wdqs-blazegraph` [production]
00:48 <milimetric@deploy1001> Started deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints [production]
2020-09-03
23:31 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: 93947391e97be11a9cd7eb4713b274b05d5b371a: Start logging log-ins on select wikis (T253802) (duration: 00m 56s) [production]
22:18 <legoktm> manually kicking mirror script, it apparently got stuck on 2020-07-01 [packagist-mirror]
22:10 <legoktm> switch domain to wmcloud.org [packagist-mirror]
21:50 <legoktm> added libraryupgrader2.wmcloud.org DNS proxy and removed the wmflabs.org one for automatic redirect (T261995) [library-upgrader]
21:18 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
21:15 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:14 <balloons> increased cores to 24 and ram to 49152 [wmde-dashboards]
19:55 <milimetric@deploy1001> deploy aborted: AQS: Deploying new geoeditors endpoints (duration: 00m 13s) [production]
19:54 <milimetric@deploy1001> Started deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints [production]
19:15 <milimetric> finished deploying refinery and refinery-source, restarting jobs now [analytics]
19:07 <milimetric@deploy1001> Finished deploy [analytics/refinery@e4d5149] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d5149] (duration: 00m 08s) [production]
19:07 <milimetric@deploy1001> Started deploy [analytics/refinery@e4d5149] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d5149] [production]
19:06 <milimetric@deploy1001> Finished deploy [analytics/refinery@e4d5149]: Regular analytics weekly train [analytics/refinery@e4d5149] (duration: 09m 06s) [production]
18:57 <milimetric@deploy1001> Started deploy [analytics/refinery@e4d5149]: Regular analytics weekly train [analytics/refinery@e4d5149] [production]
17:50 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:48 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:47 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:46 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
17:46 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:45 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:44 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:43 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:43 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:41 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]