2021-02-09 §
14:57 <elukey> restart an-worker1102 with 16g heap size to allow bootstrap [analytics]
14:54 <elukey> restart an-worker1090 with 16g heap size to allow bootstrap [analytics]
14:50 <elukey> restart analytics1072 with 16g heap size to allow bootstrap [analytics]
14:50 <elukey> restart analytics1069 with 16g heap size to allow bootstrap [analytics]
14:08 <elukey> restart analytics1069's datanode with bigger heap size [analytics]
13:39 <elukey> restart hdfs-datanode on analytics10[65,69] - failed to bootstrap due to issues reading datanode dirs [analytics]
13:38 <elukey> restart hdfs-datanode on an-worker1080 (test canary - not showing up in block report) [analytics]
10:04 <elukey> stop mysql replication an-coord1001 -> an-coord1002, an-coord1001 -> db1108 [analytics]
08:29 <elukey> leave hdfs safemode to let distcp do its job [analytics]
08:25 <elukey> set hdfs safemode on for the Analytics cluster [analytics]
08:19 <elukey> umount /mnt/hdfs from all nodes using it [analytics]
08:16 <joal> Kill flink yarn app [analytics]
08:08 <elukey> stop jupyterhub on stat100x [analytics]
08:07 <elukey> stop hive on an-coord100[1,2] - prep step for bigtop upgrade [analytics]
08:05 <elukey> stop oozie an-coord1001 - prep step for bigtop upgrade [analytics]
08:03 <elukey> stop presto-server on an-presto100x and an-coord1001 - prep step for bigtop upgrade [analytics]
07:28 <elukey> roll out new apt bigtop changes across all hadoop-related nodes [analytics]
07:19 <joal> Killing yarn users' applications [analytics]
07:12 <elukey> stop airflow on an-airflow1001 (prep step for bigtop) [analytics]
07:09 <elukey> stop namenode on an-worker1124 (backup cluster), create two new partitions for backup and namenode, restart namenode [analytics]
06:14 <elukey> disable timers on labstore nodes (prep step for bigtop) [analytics]
06:11 <elukey> disable systemd timers on an-launcher1002 (prep step for bigtop) [analytics]
2021-02-08 §
22:29 <elukey> the previous entry was related to the Hadoop backup cluster [analytics]
22:29 <elukey> hdfs master failover an-worker1118 -> an-worker1124, created dedicated partition for /var/lib/hadoop/name (root partition filled up), restarted namenode on 1118 (now recovering edit logs) [analytics]
18:44 <razzi> rebalance kafka partitions for eventlogging_VirtualPageView [analytics]
15:11 <ottomata> set kafka topic retention to 31 days for (eqiad|codfw.rdf-streaming-updater.mutation) in kafka main-eqiad and main-codfw - T269619 [analytics]
2021-02-05 §
20:31 <razzi> rebalance kafka partitions for eventlogging_SearchSatisfaction [analytics]
19:11 <razzi> rebalance kafka partitions for eqiad.mediawiki.client.session_tick [analytics]
18:38 <razzi> rebalance kafka partitions for codfw.mediawiki.client.session_tick [analytics]
17:53 <razzi> rebalance kafka partitions for codfw.resource_change [analytics]
17:53 <razzi> rebalance kafka partitions for eqiad.resource_change [analytics]
11:31 <elukey> restart turnilo to pick up changes to the config (two new attributes to webrequest_128) [analytics]
2021-02-04 §
19:27 <razzi> rebalance kafka partitions for eqiad.mediawiki.job.wikibase-addUsagesForPage [analytics]
19:27 <razzi> rebalance kafka partitions for codfw.mediawiki.job.wikibase-addUsagesForPage [analytics]
19:22 <razzi> rebalance kafka partitions for eventlogging_MobileWikiAppLinkPreview [analytics]
17:04 <elukey> restart presto coordinator on an-coord1001 to pick up logging settings (log to http-request.log) [analytics]
17:02 <elukey> roll restart presto on an-presto* to finally get http-request.log [analytics]
11:28 <elukey> move aqs druid snapshot config to 2021-01 [analytics]
09:01 <elukey> restart superset and disable memcached caching [analytics]
08:08 <elukey> move an-worker1117 from Hadoop Analytics to Hadoop Backup [analytics]
2021-02-03 §
21:37 <razzi> rebalance kafka partitions for eventlogging_MobileWikiAppLinkPreview [analytics]
20:04 <razzi> rebalance kafka partitions for eqiad.mediawiki.job.RecordLintJob [analytics]
20:03 <razzi> rebalance kafka partitions for codfw.mediawiki.job.RecordLintJob [analytics]
18:28 <razzi> rebalance kafka partitions for eqiad.mediawiki.job.refreshLinks [analytics]
18:28 <razzi> rebalance kafka partitions for codfw.mediawiki.job.refreshLinks [analytics]
17:52 <razzi> rebalance kafka partitions for eqiad.wdqs-internal.sparql-query [analytics]
17:50 <razzi> rebalance kafka partitions for codfw.wdqs-internal.sparql-query [analytics]
14:48 <elukey> sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod -R o+rx /wmf/data/wmf/mediawiki/history_reduced [analytics]
14:45 <elukey> sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod o+rx /wmf/data/wmf/mediawiki [analytics]
14:40 <elukey> kill + restart webrequest-druid-{hourly,daily} to pick up new changes after refinery deployment [analytics]