2021-05-17 §
11:09 <joal> Restart cassandra-daily-wf-local_group_default_T_unique_devices-2021-5-4 for testing after host generating failures has been moved out of cluster [analytics]
10:41 <joal> Restart cassandra-daily-wf-local_group_default_T_unique_devices-2021-5-4 for testing after drop/create of keyspace [analytics]
10:28 <joal> Restart cassandra-daily-wf-local_group_default_T_unique_devices-2021-5-4 for testing [analytics]
09:45 <joal> Rerun of cassandra-daily-wf-local_group_default_T_pageviews_per_article_flat-2021-5-15 [analytics]
2021-05-13 §
11:41 <hnowlan> running truncate "local_group_default_T_pageviews_per_article_flat".data; on aqs1012 [analytics]
2021-05-12 §
15:17 <ottomata> dropped event.mediawiki_job_* tables and data directories with mforns - T273789 [analytics]
13:56 <ottomata> removing refine_mediawiki_job Refine jobs - T281605 [analytics]
2021-05-11 §
21:00 <mforns> finished repeated refinery deployment (matching source v0.1.11) - missed unmerged change [analytics]
19:59 <mforns> repeating refinery deployment (matching source v0.1.11) - missed unmerged change [analytics]
19:53 <mforns> finished refinery deployment (matching source v0.1.11) [analytics]
18:41 <mforns> starting refinery deployment (matching source v0.1.11) [analytics]
17:26 <mforns> deployed refinery-source v0.1.11 [analytics]
2021-05-06 §
21:27 <razzi> sudo manage_principals.py reset-password nahidunlimited --email_address=nsultan@wikimedia.org [analytics]
13:29 <elukey> roll restart of hadoop yarn nodemanagers to pick up TasksMax=26214 [analytics]
12:39 <elukey> restart Yarn RMs to apply the dominant resource calculator setting - T281792 [analytics]
12:15 <hnowlan> changed eventlogging CNAME to point to eventlog1003 [analytics]
09:19 <hnowlan> starting decommission of eventlog1002 [analytics]
2021-05-05 §
17:36 <razzi> create principal for sihe: sudo manage_principals.py create sihe --email_address=silvan.heintze@wikimedia.de [analytics]
12:22 <joal> Reset monitor_refine_eventlogging_legacy after manual rerun of failed job [analytics]
12:02 <joal> rerun cassandra-daily-wf-local_group_default_T_top_percountry-2021-5-4 [analytics]
2021-05-04 §
20:30 <joal> Kill-restart 16 cassandra jobs [analytics]
20:29 <joal> Kill-restart referer-daily job [analytics]
20:12 <joal> Deploy refinery onto HDFS [analytics]
19:46 <joal> Deploying refinery using scap [analytics]
19:34 <joal> refinery v0.1.10 released to Archiva [analytics]
2021-05-03 §
14:23 <ottomata> stopping all venv based jupyter singleuser servers - T262847 [analytics]
13:59 <ottomata> dropped all obsolete (upper cased location) event_sanitized.*_T280813 tables created for T280813 [analytics]
10:43 <joal> Add _SUCCESS flag to /wmf/data/raw/mediawiki_private/tables/cu_changes/month=2021-04 after having manually sqooped missing tables [analytics]
09:57 <joal> restart refinery-sqoop-mediawiki-private timer after patch [analytics]
09:56 <joal> Reset refinery-sqoop-mediawiki-private timer [analytics]
09:38 <joal> Drop already sqooped data to restart jobs [analytics]
08:53 <joal> Deploy refinery for sqoop hotfix [analytics]
08:33 <elukey> clean up libmariadb-java from hadoop workers and clients [analytics]
07:46 <joal> Kill prod sqoop job to restart after fix [analytics]
2021-04-30 §
07:04 <elukey> hue restarted using the database 'hue' instead of 'hue_next' [analytics]
06:56 <elukey> stop hue to allow database rename (hue_next -> hue) [analytics]
2021-04-29 §
15:55 <razzi> restart hadoop-yarn-nodemanager and hadoop-hdfs-datanode on an-worker1100 for hadoop to recognize new disk /dev/sdl [analytics]
15:38 <ottomata> enabling event_sanitized_main jobs - T273789 [analytics]
14:57 <elukey> run mysql_upgrade on an-coord1001 to complete the buster upgrade - T278424 [analytics]
14:44 <hnowlan> restored all eventlogging jobs to eventlog1003 [analytics]
14:21 <hnowlan> bump eventlog1003 CPUs to 6 [analytics]
13:53 <joal> Rerun failed pageview-hourly-wf-2021-4-29-11 and pageview-hourly-wf-2021-4-29-12 [analytics]
13:09 <joal> Rerun failed pageview-hourly-wf-2021-4-29-11 [analytics]
12:35 <hnowlan> restarting 2 processors on eventlog1002 [analytics]
12:02 <hnowlan> stopping processors on eventlog1002 to migrate to eventlog1003 [analytics]
11:50 <elukey> manual stop of one of the eventlog processors on eventlog1002 to see if 1003 takes it over [analytics]
02:59 <milimetric> deployed hotfix for referrer job [analytics]
2021-04-28 §
17:46 <hnowlan> eventlog1003 joined to groups successfully [analytics]
17:36 <razzi> sudo mkdir /srv/log/eventlogging and sudo chown eventlogging:eventlogging /srv/log/eventlogging to work around missing directory puppet error (to be puppetized later) [analytics]
17:31 <razzi> remove deployment cache on eventlogging1003: sudo rm -fr /srv/deployment/eventlogging/analytics-cache/ [analytics]