2023-04-18 §
13:49 <btullis> re-enabling YARN queues [analytics]
13:43 <btullis> leaving HDFS safe mode on an-master1001 [analytics]
11:55 <btullis> entering safe mode for prod hadoop HDFS [analytics]
11:48 <btullis> depooled aqs10[14,15,19] [analytics]
11:45 <btullis> depooled schema1004 T333377 [analytics]
11:41 <btullis> refreshed yarn queues with `sudo cumin '(A:hadoop-master or A:hadoop-standby)' 'kerberos-run-command yarn /usr/bin/yarn rmadmin -refreshQueues'` [analytics]
11:36 <btullis> stopping YARN queues T333377 [analytics]
11:34 <btullis> disable gobblin timers T333377 [analytics]
08:39 <btullis> rebooting an-worker1110 to attempt upgrading RAID controller firmware [analytics]
2023-04-17 §
20:48 <joal> Restart AQS to pick up druid new datasource using scap [analytics]
18:34 <xcollazo> Removed old Airflow cached artifacts. Details at T334886. [analytics]
17:26 <SandraEbele> restarted turnilo with `sudo systemctl restart turnilo` [analytics]
17:13 <SandraEbele> restarted Oozie pageview-druid-daily job 0174450-220913162928808-oozie-oozi-C [analytics]
17:00 <xcollazo> scap deploy 'analytics: deploy Airflow ArchiveOperator should have a number of retries of 0. T332216' [analytics]
16:56 <SandraEbele> restarted Oozie pageview-druid-hourly job 0174449-220913162928808-oozie-oozi-C [analytics]
11:12 <btullis> running sre.hadoop.init-hadoop-workers an-worker1132.eqiad.wmnet [analytics]
10:32 <btullis> reimaging an-worker1132 [analytics]
2023-04-13 §
21:37 <SandraEbele> Successfully Deployed analytics refinery using scap, then deployed onto hdfs. [analytics]
15:42 <SandraEbele> paused Oozie pageview-druid-hourly job. [analytics]
15:27 <SandraEbele> deploying analytics refinery-update pageview druid table [analytics]
08:19 <steve_munene> Decommission an-worker1132 from the Hadoop cluster for T333091 reimage [analytics]
2023-04-12 §
15:16 <mforns> cleared airflow task aggregate_projectview_geographically from dag projectview_geo for 2023-04-12T08->09 [analytics]
14:50 <mforns> cleared airflow task aggregrate_pageview_to_projectview from projectview_hourly dag for 2023-04-12T08->09 [analytics]
14:39 <mforns> cleared airflow task aggregate_pageview_actor_to_pageview_hourly from dag pageview_hourly for 2023-04-12T08->09 [analytics]
14:30 <mforns> re-ran airflow task compute_pageview_actor_hourly for dag pageview_actor_hourly for 2023-04-12T08->09 [analytics]
09:24 <aqu> About to migrate refine webrequest from Oozie to Airflow [analytics]
08:31 <aqu> About to deploy analytics/refinery in production [analytics]
2023-04-11 §
20:22 <mforns> deployed airflow analytics to remove network flows sanitization dag [analytics]
19:17 <SandraEbele> Unpaused pageview_hourly airflow dag. [analytics]
19:17 <SandraEbele> deployed airflow fix for pageview_hourly dag memory error [analytics]
16:28 <mforns> deployed airflow analytics to fix network flows internal dags in deployment [analytics]
15:27 <SandraEbele> Deployed refinery using scap, then deployed onto hdfs. [analytics]
13:46 <elukey> powercycle analytics1069, down for some days now, host stuck from the mgmt/serial console [analytics]
08:14 <aqu> About to deploy analytics/refinery (To migrate webrequest load from Oozie to Airflow) [analytics]
2023-04-10 §
19:20 <mforns> deployed airflow analytics to fix mediawiki wikitext history [analytics]
2023-04-07 §
10:34 <aqu> About to deploy analytics/refinery in test cluster [analytics]
2023-04-05 §
20:17 <mforns> deployed airflow to fix aqs pageview ranks [analytics]
20:08 <mforns> finished second refinery deployment to fix aqs rankings [analytics]
19:54 <mforns> starting second refinery deployment to fix aqs rankings [analytics]
19:35 <mforns> finished refinery deployment to fix aqs rankings [analytics]
19:18 <mforns> starting refinery deployment to fix aqs rankings [analytics]
16:24 <elukey> kafka test cluster migrated to bullseye [analytics]
14:00 <elukey> powercycle an-worker1132 [analytics]
2023-04-04 §
13:39 <steve_munene> leave hdfs safemode T331882 [analytics]
12:57 <steve_munene> putting hdfs into safe mode as part of T331882 [analytics]
11:42 <elukey> stop puppet on an-launcher1002 and manually stop .timer units [analytics]
07:34 <aqu> Rerun refine_event with "sudo -u analytics kerberos-run-command analytics /usr/local/bin/refine_event --ignore_failure_flag=true --table_include_regex='mediawiki_visual_editor_feature_use|mediawiki_edit_attempt|mediawiki_web_ui_interactions' --since='2023-04-02T18:00:00.000Z' --until='2023-04-03T19:00:00.000Z'" [analytics]
2023-04-03 §
08:01 <elukey> fix old envoyproxy monitor for an-test-ui1001 [analytics]
2023-03-31 §
12:23 <btullis> deploying datahub to staging T333580 [analytics]
08:44 <btullis> Shutting down an-worker1091 for RAID battery replacement T332883 [analytics]