2016-10-17 §
16:57 <elukey> started the oozie coordinator 0034720-160922102909979-oozie-oozi-C to re-execute webrequest-load-wf-upload-2016-10-17-14 [analytics]
16:42 <ottomata> restarting hadoop nodemanagers 1 at a time [analytics]
15:32 <ottomata> rebooting analytics1030 [analytics]
08:24 <elukey> upgraded nodejs on aqs100[56] (already done on aqs1004) [analytics]
07:35 <elukey> created oozie coordinator 0034240-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-6 [analytics]
06:18 <elukey> created oozie coordinator 0034161-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-4 [analytics]
06:17 <elukey> created oozie coordinator 0034153-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-3 [analytics]
06:16 <elukey> created oozie coordinator 0034149-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-2 [analytics]
06:15 <elukey> created oozie coordinator 0034143-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-1 [analytics]
2016-10-13 §
06:57 <elukey> launched 0029282-160922102909979-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-13-5 with higher error threshold [analytics]
06:56 <elukey> launched 0029278-160922102909979-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-13-2 with higher error threshold [analytics]
2016-10-11 §
18:57 <elukey> kafka1018 back in service after maintenance [analytics]
13:44 <elukey> merged https://gerrit.wikimedia.org/r/315101 on stat1002 (removal of ::statistics:wikistats) [analytics]
13:00 <joal> Start prod version of wdqs_extract job [analytics]
12:16 <joal> Deploying refinery [analytics]
11:17 <elukey> started 0026094-160922102909979-oozie-oozi-C to fix webrequest-load-check_sequence_statistics-wf-upload-2016-10-11-9 (oozie data consistency errors) [analytics]
08:22 <joal> Killing cassandra loading jobs for old aqs [analytics]
2016-10-10 §
13:22 <elukey> moved pivot apache vhost to localhost:9090 [analytics]
09:16 <joal_> Deploy refinery [analytics]
2016-10-04 §
14:06 <elukey> applied role::spare to analytics1026; opened a task to decom it together with 1015 [analytics]
07:10 <elukey> rebooting eventlog1001 for kernel upgrades (EventLogging stopped) [analytics]
2016-09-23 §
09:06 <elukey> reboot eventlog2001.codfw.wmnet for kernel upgrades [analytics]
08:45 <elukey> upgrading varnishkafka to 1.0.12-1 in cache:misc [analytics]
08:32 <elukey> upgrading varnishkafka to 1.0.12-1 in cache:maps [analytics]
2016-09-22 §
15:30 <elukey> analytics1001 is back Yarn/HDFS master [analytics]
13:16 <elukey> previous comment was meant to be read as "set a permanent read only = false" [analytics]
13:16 <elukey> set read_only = false (on startup) for the analytics1003's mariadb instance [analytics]
13:12 <elukey> restarted oozie jobs for 2016-9-22-6 [analytics]
12:50 <elukey> varnishkafka 1.0.12 installed in cache:upload ulsfo and eqiad [analytics]
11:04 <elukey> re-enabling oozie and camus after cluster reboots [analytics]
10:57 <elukey> rebooted analytics1001 [analytics]
10:55 <elukey> Failover from analytics1001 to analytics1002 as prep step for 1001's reboot [analytics]
10:28 <elukey> setting global read_only = 0 to analytics1003 mariadb instance [analytics]
10:04 <elukey> rebooted analytics1003 (oozie, hive-metastore and hive-server2 daemons affected) [analytics]
09:51 <elukey> executed aptitude remove apache2 on analytics1027 (we use nginx in front of hue; apache steals port 8888 from hue, so hue does not start) [analytics]
09:49 <elukey> suspended all oozie bundles as prep step to reboot analytics1003 [analytics]
09:39 <elukey> rebooted analytics1027 [analytics]
09:14 <elukey> varnishkafka 1.0.12 installed in cache:upload codfw [analytics]
08:52 <elukey> varnishkafka 1.0.12 installed in cache:upload esams [analytics]
06:45 <elukey> stopped camus on analytics1027 and suspended webrequest-load-bundle via Hue (prep step for reboots) [analytics]
2016-09-21 §
17:43 <elukey> installed varnishkafka 1.0.12-1 on cp3034.esams [analytics]
06:25 <elukey> removed aqs100[123] from live traffic [analytics]
2016-09-20 §
17:03 <elukey> aqs100[56] added to LVS and serving live traffic [analytics]
16:22 <elukey> restarting cassandra on aqs1005 [analytics]
07:40 <elukey> restart cassandra on aqs100[456] for T130861 - only aqs1004 is taking live traffic [analytics]
2016-09-16 §
09:24 <elukey> added aqs100[456] to conftool-data (not pooled but the load balancer is doing health checks) [analytics]
2016-09-14 §
16:07 <elukey> cassandra on aqs100[123] restarted for T130861 [analytics]
2016-09-12 §
18:54 <ottomata> reenabled camus with new version of camus checker jar [analytics]
18:41 <ottomata> disabled camus crons on analytics1027 [analytics]
09:48 <elukey> restarted pivot in a tmux session on stat1002 since it died [analytics]