2019-02-11
10:47 <godog> shut deployment-prometheus01, unused now [deployment-prep]
10:47 <jdrewniak@deploy1001> Synchronized portals: Wikimedia Portals Update: [[gerrit:489649| Bumping portals to master (T128546)]] (duration: 00m 46s) [production]
10:46 <jdrewniak@deploy1001> Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:489649| Bumping portals to master (T128546)]] (duration: 00m 48s) [production]
10:41 <godog> flip tools-prometheus proxy back to tools-prometheus-01 and upgrade to prometheus 2.7.1 [tools]
10:41 <jynus> upgrading mariadb client on cumin* hosts [production]
10:27 <mvolz@deploy1001> scap-helm zotero finished [production]
10:27 <mvolz@deploy1001> scap-helm zotero cluster codfw completed [production]
10:27 <mvolz@deploy1001> scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw] [production]
10:24 <mvolz@deploy1001> scap-helm zotero finished [production]
10:24 <mvolz@deploy1001> scap-helm zotero cluster eqiad completed [production]
10:24 <mvolz@deploy1001> scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad] [production]
10:19 <marostegui> Add dbstore1005:3350 to tendril and zarcillo - T210478 [production]
10:17 <mvolz@deploy1001> scap-helm zotero finished [production]
10:17 <mvolz@deploy1001> scap-helm zotero cluster staging completed [production]
10:17 <mvolz@deploy1001> scap-helm zotero upgrade staging -f zotero-values-staging.yaml --version=0.0.1 stable/zotero [namespace: zotero, clusters: staging] [production]
10:17 <jynus> restart db1114 [production]
10:01 <elukey> restart superset to pick up new config.py changes [analytics]
09:38 <marostegui> Stop all mysql instances on dbstore1005 for reboot [production]
09:11 <marostegui> Stop all mysql instances on dbstore1003 for reboot [production]
08:38 <elukey> restart superset to pick up new settings in config.py [analytics]
08:17 <moritzm> removed cloudcontrol2001-dev.codfw.wmnet from debmonitor (actual hostname in use is cloudcontrol2001-dev.wikimedia.org) [production]
08:07 <marostegui> Deploy schema change on s8 codfw master (db2045) - this will generate lag on codfw T210713 [production]
07:43 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Fully repool db1100 (duration: 00m 46s) [production]
07:39 <marostegui> Deploy schema change on s7 primary master (db1062) - T210713 [production]
07:27 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Give api traffic to db1100 (duration: 00m 46s) [production]
07:18 <marostegui> Stop all mysql instances on dbstore1004 for a reboot [production]
07:17 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Repool db1100 with low weight (duration: 00m 46s) [production]
07:06 <marostegui> Upgrade MySQL on db1100 [production]
07:06 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool db1100 for mysql upgrade (duration: 00m 47s) [production]
07:00 <marostegui> Restart icinga on icinga1001 - checks went awol [production]
06:51 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Repool db1079 (duration: 00m 48s) [production]
06:14 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool db1079 (duration: 00m 48s) [production]
06:14 <marostegui@deploy1001> sync-file aborted: Depool db0179 (duration: 00m 01s) [production]
04:23 <TimStarling> on mwmaint1002: running normalizeThrottleParameters.php --dry-run on all wikis (T209565) [production]
04:19 <tstarling@deploy1001> Synchronized php-1.33.0-wmf.16/extensions/AbuseFilter/maintenance/normalizeThrottleParameters.php: maintenance script update for new dry run (duration: 00m 47s) [production]
04:19 <tstarling@deploy1001> Synchronized php-1.33.0-wmf.16/extensions/WikimediaEvents/tests/phpunit/PageViewsTest.php: test-only undeployed change (duration: 00m 46s) [production]
04:18 <tstarling@deploy1001> Synchronized php-1.33.0-wmf.16/extensions/NavigationTiming/tests/ext.navigationTiming.test.js: test-only undeployed change (duration: 00m 51s) [production]
04:10 <tstarling@deploy1001> sync-file aborted: test-only undeployed change (duration: 00m 12s) [production]
03:06 <Reedy> graceful restart of zuul as no jobs were running [releng]
03:05 <kartik@deploy1001> Finished deploy [cxserver/deploy@ee4a15a]: Update cxserver to 8928852 (T213256) (duration: 04m 08s) [production]
03:00 <kartik@deploy1001> Started deploy [cxserver/deploy@ee4a15a]: Update cxserver to 8928852 (T213256) [production]
00:27 <bd808> Restarted webservice. Looks like someone found a slow IP to look up, handed that URL to a bunch of folks, and eventually locked up all the lighttpd threads with the same slow query [tools.whois]
2019-02-10
20:07 <bd808> Deploy 5f20413 Make labels for legacy Trusty grid explicit (T215712) [tools.admin]
19:59 <volans|off> force rebooting mw1299, stuck again - T215569 [production]
19:16 <volans|off> forcing reboot of icinga1001 because it's stuck again (no ping, no ssh, CPU stuck messages on console) - T214760 [production]
10:52 <elukey> re-run webrequest upload webrequest-load-wf-upload-2019-2-10-0 [analytics]
10:52 <elukey> killed oozie job related to webrequest-load-wf-upload-2019-2-10-0, seemed stuck in generate_sequence_statistics (not really clear why) [analytics]
09:25 <marostegui> Disable notifications for lag checks on dbstore1002 - T210478 [production]
03:22 <Krinkle> Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/489434 (Create quibble-stretch-hhvm, replacing jessie) [releng]
02:06 <Krinkle> Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/489430 [releng]