analytics SAL

2021-10-06 §
14:30	<elukey>	upgrade stat1005 to ROCm 4.2.0	[analytics]
13:20	<btullis>	btullis@aqs1004:~$ sudo nodetool-a clearsnapshot	[analytics]
10:20	<elukey>	upgrade ROCm to 4.2 on stat1008	[analytics]
2021-10-05 §
11:28	<elukey>	failover analytics-hive back to an-coord1001 after maintenance	[analytics]
2021-10-04 §
16:56	<elukey>	restart java daemons on an-coord1001 (standby)	[analytics]
13:43	<elukey>	failover analytics-hive to an-coord1002 (to restart java daemons on 1001)	[analytics]
07:43	<joal>	Kill-restart mediawiki-history-reduced job after deploy (more resources)	[analytics]
07:32	<joal>	Deploy refinery to hdfs	[analytics]
07:10	<joal>	Deploy refinery for mediawiki-history-reduced hotfix	[analytics]
06:56	<joal>	Kill-restart pageview-monthly_dump-coord to apply fix for SLA	[analytics]
2021-10-01 §
15:11	<btullis>	sudo -u analytics kerberos-run-command analytics /usr/local/bin/refine_eventlogging_legacy --ignore_failure_flag=true --table_include_regex='editoractivation' --since='2021-09-29T22:00:00.000Z' --until='2021-09-30T23:00:00.000Z'	[analytics]
2021-09-30 §
19:55	<ottomata>	not changing stats uid to 499; it already exists as another system user	[analytics]
19:54	<ottomata>	changing stats uid and gid on an-launcher1002 and stat1005 to 499	[analytics]
09:32	<btullis>	btullis@an-launcher1002:~$ sudo -u analytics kerberos-run-command analytics /usr/local/bin/refine_netflow --ignore_failure_flag=true --since=2021-09-28T11:00:00 --until 2021-09-28T12:00:00	[analytics]
2021-09-29 §
09:16	<elukey>	restart hive-* units on an-coord1002 for openjdk upgrades (standby node)	[analytics]
2021-09-28 §
13:14	<btullis>	Deployed refinery using scap, then deployed onto hdfs	[analytics]
12:34	<btullis>	deploying refinery	[analytics]
09:55	<btullis>	btullis@cumin1001:~$ sudo cumin --mode async 'aqs100*.eqiad.wmnet' 'nodetool-a snapshot -t T291472 local_group_default_T_pageviews_per_article_flat' 'nodetool-b snapshot -t T291472 local_group_default_T_pageviews_per_article_flat'	[analytics]
09:36	<elukey>	restart java daemons on an-test-coord1001 to pick up new openjdk	[analytics]
2021-09-27 §
11:18	<btullis>	btullis@stat1005:~$ sudo apt purge usrmerge	[analytics]
11:11	<btullis>	btullis@stat1005:~$ sudo apt install usrmerge	[analytics]
2021-09-24 §
22:33	<razzi>	restart an-test-coord presto coordinator service to experiment with web-ui.authentication.type=fixed	[analytics]
15:06	<btullis>	btullis@cumin1001:~$ sudo cumin --mode async 'aqs100[4,7].eqiad.wmnet' 'nodetool-a snapshot -t T291469' 'nodetool-b snapshot -t T291469'	[analytics]
14:47	<btullis>	btullis@aqs1007:~$ sudo nodetool-a repair --full local_group_default_T_mediarequest_per_file data	[analytics]
11:02	<btullis>	btullis@an-master1001:~$ sudo systemctl restart hadoop-mapreduce-historyserver	[analytics]
10:47	<btullis>	btullis@an-master1002:~$ sudo systemctl restart hadoop-hdfs-namenode	[analytics]
10:47	<btullis>	btullis@an-master1002:~$ sudo systemctl restart hadoop-hdfs-zkfc	[analytics]
10:35	<btullis>	btullis@an-master1001:~$ sudo -u hdfs kerberos-run-command hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet	[analytics]
10:07	<btullis>	btullis@an-launcher1002:~$ sudo -u analytics kerberos-run-command analytics /usr/local/bin/refine_eventlogging_legacy --ignore_failure_flag=true --table_include_regex='centralnoticeimpression' --since='2021-09-23T04:00:00.000Z' --until='2021-09-24T05:00:00.000Z'	[analytics]
2021-09-22 §
17:23	<razzi>	razzi@an-test-coord1001:/etc/presto$ sudo systemctl restart presto-server	[analytics]
17:05	<joal>	Kill-restart oozie jobs after deploy (mediawiki-history-denormalize-coord, mediawiki-history-check_denormalize-coord, mediawiki-history-dumps-coord, mediawiki-history-reduced-coord)	[analytics]
11:54	<joal>	release refinery-source v0.1.18 to archiva with Jenkins	[analytics]
2021-09-20 §
08:12	<elukey>	remove old /reportcard (password protected, old files from 2012) httpd settings for stats.wikimedia.org	[analytics]
2021-09-18 §
06:48	<joal>	Rerun webrequest-load-wf-text-2021-9-18-0 for errors after last night's production issue	[analytics]
2021-09-17 §
16:03	<btullis>	Cleared all snapshots on aqs100[47] to reclaim space with nodetool-[ab] clearsnapshot (T249755)	[analytics]
15:15	<btullis>	btullis@aqs1004:~$ sudo nodetool-a repair --full && sudo nodetool-b repair --full (T249755)	[analytics]
10:18	<btullis>	btullis@an-web1001:~$ sudo find /srv/published-rsynced -user systemd-coredump -exec chown stats {} \;	[analytics]
09:47	<milimetric>	deployed refinery to sync sanitize allowlist, deleting event_sanitized data per decision in the task	[analytics]
08:21	<elukey>	disable mod_cgi/mod_cgid on an-web1001 (and remove cgi-perl related httpd configs/settings)	[analytics]
2021-09-16 §
19:25	<ottomata>	pointing analytics-web cname at new an-web1001, this moves stats and analytics .wm.org from thorium to an-web1001 - T285355	[analytics]
18:30	<joal>	Create HDFS home folder for user 'analytics-research'	[analytics]
07:03	<elukey>	stop jupyter-kaywong-singleuser.service on stat1005 to allow puppet to clean up	[analytics]
2021-09-15 §
16:26	<joal>	Deploying refinery	[analytics]
2021-09-13 §
18:25	<razzi>	(I stopped replication earlier but forgot to !log)	[analytics]
18:24	<razzi>	razzi@dbstore1007:~$ for socket in /run/mysqld/*; do sudo mysql --socket=$socket -e "START SLAVE"; done - reenable replication for T290841	[analytics]
18:19	<razzi>	razzi@dbstore1007:~$ sudo systemctl restart mariadb@s4.service for T290841	[analytics]
18:13	<razzi>	razzi@dbstore1007:~$ sudo systemctl restart mariadb@s3.service for T290841	[analytics]
18:05	<razzi>	sudo systemctl restart mariadb@s2.service	[analytics]
2021-09-07 §
11:41	<joal>	Restarting cassandra hourly loading job after C2 snapshot taken and C3 tables truncated	[analytics]
11:37	<joal>	Re-Add test rows in cassandra3 cluster after tables got truncated	[analytics]