2020-06-20 §
07:41 <elukey> powercycle an-worker1093 - soft lockup CPU bug shown in mgmt console [analytics]
07:37 <elukey> powercycle an-worker1091 - soft lockup CPU bug shown in mgmt console [analytics]
2020-06-17 §
19:59 <milimetric> deployed quick fix for data stats job [analytics]
18:04 <elukey> decommission matomo1001 [analytics]
16:57 <ottomata> produce searchsatisfaction events on group0 wikis via eventgate - T249261 [analytics]
07:17 <joal> Deleting mediawiki-history-text (avro) for 2020-01 and 2020-02 (we still have 2020-03 and 2020-04) - Expected free space: 160TB [analytics]
06:40 <elukey> reboot krb1001 for kernel upgrades [analytics]
06:24 <elukey> reboot an-master100[1,2] for kernel upgrades [analytics]
06:03 <elukey> reboot an-conf100[1-3] for kernel upgrades [analytics]
05:45 <elukey> reboot stat1007/8 for kernel upgrades [analytics]
2020-06-16 §
19:58 <ottomata> evolving event.SearchSatisfaction Hive table using /analytics/legacy/searchsatisfaction/latest schema [analytics]
19:41 <ottomata> bumping Refine refinery jar version to 0.0.127 - T238230 [analytics]
19:17 <ottomata> deploying refinery source 0.0.127 for eventlogging -> eventgate migration - T249261 [analytics]
16:02 <elukey> reboot kafka-jumbo1008 for kernel upgrades [analytics]
15:33 <milimetric> refinery deployed and synced to hdfs, with refinery-source at 0.0.126 [analytics]
15:20 <elukey> reboot kafka-jumbo1007 for kernel upgrades [analytics]
15:13 <elukey> re-enabling timers on launcher after maintenance [analytics]
15:06 <elukey> reboot an-coord1001 for kernel upgrades [analytics]
14:27 <elukey> stop timers on an-launcher1001, prep before rebooting an-coord1001 [analytics]
14:23 <elukey> reboot druid100[7,8] for kernel upgrades [analytics]
11:51 <elukey> re-run webrequest-druid-hourly-coord 16/06T10 [analytics]
11:36 <elukey> reboot an-druid100[1,2] for kernel upgrades [analytics]
2020-06-15 §
09:37 <elukey> restart refinery-druid-drop-public-snapshots.service after change in vlan firewall rules (added druid100[7,8] to term druid) [analytics]
2020-06-11 §
15:01 <mforns> started refinery deploy for v0.0.126 [analytics]
14:58 <mforns> deployed refinery-source v0.0.126 [analytics]
13:57 <ottomata> removed accidentally added page_restrictions column(s) on Hive table event.mediawiki_user_blocks_change after an incorrect schema change was merged (no data was ever set in this column) [analytics]
2020-06-09 §
07:32 <elukey> upgrade ROCm to 3.3 on stat1005 [analytics]
2020-06-08 §
15:42 <elukey> remove access to notebook100[3,4] - T249752 [analytics]
14:07 <elukey> move matomo cron archiver to systemd timer archiver (with nagios alarming) [analytics]
14:02 <elukey> re-enable timers on an-coord1001 [analytics]
14:01 <elukey> restart hive/oozie on an-coord1001 for openjdk upgrades [analytics]
13:42 <elukey> roll restart kafka jumbo brokers for openjdk upgrades [analytics]
13:26 <elukey> stop timers on an-launcher to drain jobs and restart hive/oozie for openjdk upgrades [analytics]
2020-06-05 §
17:56 <elukey> roll restart presto server on an-presto* to pick up new openjdk upgrades [analytics]
16:45 <elukey> upgrade turnilo to 1.24.0 [analytics]
13:26 <elukey> reimage druid1006 to debian buster [analytics]
09:26 <elukey> roll restart cassandra on AQS to pick up openjdk upgrades [analytics]
2020-06-04 §
19:12 <elukey> roll restart of aqs to pick up new druid settings [analytics]
18:39 <mforns> deployed wikistats2 2.7.5 [analytics]
13:33 <elukey> re-enable netflow hive2druid jobs after https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/602356/ [analytics]
10:56 <elukey> depooled and reimage druid1004 to Debian Buster (Druid public cluster) [analytics]
07:31 <elukey> stop netflow hive2druid timers to do some experiments [analytics]
06:13 <elukey> kill application_1589903254658_75731 (druid indexation for netflow still running after 12h) [analytics]
05:36 <elukey> restart druid middlemanager on druid1002 - strange protobuf warnings, netflow hive2druid indexation job stuck for hours [analytics]
05:13 <elukey> reimage druid1003 to Buster [analytics]
2020-06-03 §
17:10 <elukey> restart RU jobs after adding memory to an-launcher1001 [analytics]
16:57 <elukey> reboot an-launcher1001 to get new memory [analytics]
16:01 <elukey> stop timers on an-launcher, prep for reboot [analytics]
09:35 <elukey> re-run webrequest-druid-hourly-coord 03/06T7 (failed due to druid1002 moving to buster) [analytics]
08:50 <elukey> reimage druid1002 to Buster [analytics]