2020-05-07
13:33 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:30 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:30 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:28 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:21 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:19 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:12 <hashar@deploy1001> rebuilt and synchronized wikiversions files: (no justification provided) [production]
13:04 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:04 <jynus> disabling puppet on all db hosts to control deployment of new paging alert T172489 [production]
13:02 <zpapierski@deploy1001> Finished deploy [wdqs/wdqs@94906d0]: Deploy WDQS 0.3.28 + GUI - new servers (duration: 02m 43s) [production]
13:01 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:59 <zpapierski@deploy1001> Started deploy [wdqs/wdqs@94906d0]: Deploy WDQS 0.3.28 + GUI - new servers [production]
12:50 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:48 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:43 <zpapierski@deploy1001> Finished deploy [wdqs/wdqs@94906d0]: Deploy WDQS 0.3.28 + GUI (duration: 16m 20s) [production]
12:36 <arturo> cleanup livehacks in toolsbeta-puppetmaster-03 [toolsbeta]
12:34 <mutante> removing role::labs::lvm::srv from deployment servers since this is now included in role:deployment_server and should never have been a role in the first place [releng]
12:34 <mutante> removing role::labs::lvm::srv from deployment servers since this is now included in role:deployment_server and should never have been a role in the first place [deployment-prep]
12:27 <zpapierski@deploy1001> Started deploy [wdqs/wdqs@94906d0]: Deploy WDQS 0.3.28 + GUI [production]
12:13 <addshore@deploy1001> Synchronized php-1.35.0-wmf.31/extensions/Wikibase: [[gerrit:594920]] T252079 Revert "Move prefetching-term-lookup-callback service wiring" (duration: 01m 12s) [production]
12:12 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
12:10 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:07 <mutante> - puppet still broken on deployment_servers due to unrelated pre-existing issues, also no alerts about it in shinken [releng]
12:07 <mutante> - puppet still broken on deployment_servers due to unrelated pre-existing issues, also no alerts about it in shinken [deployment-prep]
12:04 <mutante> - puppet broken on deployment_servers - fix deployed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/594932 [releng]
12:04 <mutante> - puppet broken on deployment_servers - fix deployed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/594932 [deployment-prep]
11:55 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
11:53 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:33 <moritzm> imported component/puppet5 for jessie-wikimedia into "main" [production]
11:31 <jbond42> enable ferm-status script https://gerrit.wikimedia.org/r/c/operations/puppet/+/576102 [production]
11:12 <arturo> livehack toolsbeta-puppetmaster-03 with https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/594925 and https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/594926 (T251297 and T250866) [toolsbeta]
11:10 <matthiasmullie> EU swat done [production]
11:07 <mlitn@deploy1001> Synchronized php-1.35.0-wmf.31/extensions/WikibaseMediaInfo/: [MediaInfo] Add dummy concept chips without thumbnail (duration: 01m 09s) [production]
11:00 <joal> Moving application_1583418280867_334532 to the nice queue [analytics]
10:58 <joal> Rerun wikidata-articleplaceholder_metrics-wf-2020-5-6 [analytics]
10:07 <moritzm> installing Java security updates on restbase/sessionstore [production]
09:24 <mutante> - cloud puppetmasters still affected by https://phabricator.wikimedia.org/T83447#5807825 [devtools]
09:11 <elukey> roll restart cassandra on aqs1005 to pick up new openjdk upgrades (canary) [production]
09:07 <mutante> - puppetmaster-1001 - Permission denied @ rb_sysopen - /var/lib/puppet/volatile/GeoIP/.geoipupdate.lock [devtools]
09:06 <mutante> - avoiding the need for a second role for deployment_servers in cloud with https://gerrit.wikimedia.org/r/c/operations/puppet/+/594903 [devtools]
09:05 <mutante> - puppet fixed on deploy-1002 with https://gerrit.wikimedia.org/r/c/operations/puppet/+/594900 [devtools]
08:32 <moritzm> upgrading restbase-dev to latest OpenJDK security update [production]
08:06 <jynus> setting pc2007, pc2009 as read-write [production]
08:04 <mutante> - broken puppet again from prod changes. this time: deploy-1002 - []' is not applicable to an Undef Value. mediawiki/mcrouter_wancache.pp, line: 19 [devtools]
07:59 <mutante> - shutting down instance puppet-paladox, backups created and uploaded to deploy-1002 in devtools (T236569) [git]
07:55 <mutante> - shutting down instance gerrit-test7, backups created and uploaded to deploy1002 in devtools, disassociating floating IP (T236569) [git]
07:45 <elukey> re-run mediawiki-history-denormalize [analytics]
07:44 <godog> further decrease weight for ms-be10[678] - T252008 [production]
07:43 <elukey> kill application_1583418280867_333560 after a chat with David, the job is consuming ~2TB of RAM [analytics]
07:32 <elukey> re-run mediawiki history load [analytics]